public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH v1] RISC-V: Bugfix for vec_init repeating auto vectorization in RV32
@ 2023-06-14  0:58 pan2.li
  2023-06-14  1:07 ` juzhe.zhong
                   ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: pan2.li @ 2023-06-14  0:58 UTC (permalink / raw)
  To: gcc-patches
  Cc: juzhe.zhong, rdapp.gcc, jeffreyalaw, pan2.li, yanzhang.wang, kito.cheng

From: Pan Li <pan2.li@intel.com>

This patch would like to fix one bug exported by RV32 test case
multiple_rgroup_run-2.c. The mask should be restricted by elen in
vector, and the condition between the vmv.s.x and the vmv.v.x should
take inner_bits_size rather than constants.

Passed both the rv32 and rv64 riscv/rvv tests.

Signed-off-by: Pan Li <pan2.li@intel.com>

gcc/ChangeLog:

	* config/riscv/riscv-v.cc (rvv_builder::get_merge_scalar_mask):
	Take elen instead of scalar BITS_PER_WORD.
	(expand_vector_init_merge_repeating_sequence): Use inner_bits_size
	instead of scaler BITS_PER_WORD.
---
 gcc/config/riscv/riscv-v.cc | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index fb970344521..9270e258ca3 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -399,10 +399,11 @@ rvv_builder::get_merge_scalar_mask (unsigned int index_in_pattern) const
 {
   unsigned HOST_WIDE_INT mask = 0;
   unsigned HOST_WIDE_INT base_mask = (1ULL << index_in_pattern);
+  unsigned int elen = TARGET_VECTOR_ELEN_64 ? 64 : 32;
 
-  gcc_assert (BITS_PER_WORD % npatterns () == 0);
+  gcc_assert (elen % npatterns () == 0);
 
-  int limit = BITS_PER_WORD / npatterns ();
+  int limit = elen / npatterns ();
 
   for (int i = 0; i < limit; i++)
     mask |= base_mask << (i * npatterns ());
@@ -1923,7 +1924,7 @@ expand_vector_init_merge_repeating_sequence (rtx target,
       rtx mask = gen_reg_rtx (mask_mode);
       rtx dup = gen_reg_rtx (dup_mode);
 
-      if (full_nelts <= BITS_PER_WORD) /* vmv.s.x.  */
+      if (full_nelts <= builder.inner_bits_size ()) /* vmv.s.x.  */
 	{
 	  rtx ops[] = {dup, gen_scalar_move_mask (dup_mask_mode),
 	    RVV_VUNDEF (dup_mode), merge_mask};
@@ -1933,7 +1934,8 @@ expand_vector_init_merge_repeating_sequence (rtx target,
       else /* vmv.v.x.  */
 	{
 	  rtx ops[] = {dup, force_reg (GET_MODE_INNER (dup_mode), merge_mask)};
-	  rtx vl = gen_int_mode (CEIL (full_nelts, BITS_PER_WORD), Pmode);
+	  rtx vl = gen_int_mode (CEIL (full_nelts, builder.inner_bits_size ()),
+				 Pmode);
 	  emit_nonvlmax_integer_move_insn (code_for_pred_broadcast (dup_mode),
 					   ops, vl);
 	}
-- 
2.34.1


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v1] RISC-V: Bugfix for vec_init repeating auto vectorization in RV32
  2023-06-14  0:58 [PATCH v1] RISC-V: Bugfix for vec_init repeating auto vectorization in RV32 pan2.li
@ 2023-06-14  1:07 ` juzhe.zhong
  2023-06-14  7:30   ` Li, Pan2
  2023-06-14  7:29 ` [PATCH v2] " pan2.li
  2023-06-14  9:00 ` [PATCH v3] " pan2.li
  2 siblings, 1 reply; 11+ messages in thread
From: juzhe.zhong @ 2023-06-14  1:07 UTC (permalink / raw)
  To: pan2.li, gcc-patches
  Cc: Robin Dapp, jeffreyalaw, pan2.li, yanzhang.wang, kito.cheng

[-- Attachment #1: Type: text/plain, Size: 2989 bytes --]

>> unsigned int elen = TARGET_VECTOR_ELEN_64 ? 64 : 32;
Add comment here to demonstrate why you pick up elen to set the LIMIT.
I understand:
1. -march=zve32* ===> ELEN = 32
    -march=zve64* ===> ELEN = 64
2. both vmv.v.x/vmv.s.x is restrict to the ELEN
    For example, When ELEN=32 (-march=zve32*)
    vsetvli ...e64,m1
    vmv.v.x/vmv.s.x
    We can't support such code sequence.

You should demonstrate it clearly in the comments.

Otherwise, this patch LGTM.


juzhe.zhong@rivai.ai
 
From: pan2.li
Date: 2023-06-14 08:58
To: gcc-patches
CC: juzhe.zhong; rdapp.gcc; jeffreyalaw; pan2.li; yanzhang.wang; kito.cheng
Subject: [PATCH v1] RISC-V: Bugfix for vec_init repeating auto vectorization in RV32
From: Pan Li <pan2.li@intel.com>
 
This patch would like to fix one bug exported by RV32 test case
multiple_rgroup_run-2.c. The mask should be restricted by elen in
vector, and the condition between the vmv.s.x and the vmv.v.x should
take inner_bits_size rather than constants.
 
Passed both the rv32 and rv64 riscv/rvv tests.
 
Signed-off-by: Pan Li <pan2.li@intel.com>
 
gcc/ChangeLog:
 
* config/riscv/riscv-v.cc (rvv_builder::get_merge_scalar_mask):
Take elen instead of scalar BITS_PER_WORD.
(expand_vector_init_merge_repeating_sequence): Use inner_bits_size
instead of scaler BITS_PER_WORD.
---
gcc/config/riscv/riscv-v.cc | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)
 
diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index fb970344521..9270e258ca3 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -399,10 +399,11 @@ rvv_builder::get_merge_scalar_mask (unsigned int index_in_pattern) const
{
   unsigned HOST_WIDE_INT mask = 0;
   unsigned HOST_WIDE_INT base_mask = (1ULL << index_in_pattern);
+  unsigned int elen = TARGET_VECTOR_ELEN_64 ? 64 : 32;
-  gcc_assert (BITS_PER_WORD % npatterns () == 0);
+  gcc_assert (elen % npatterns () == 0);
-  int limit = BITS_PER_WORD / npatterns ();
+  int limit = elen / npatterns ();
   for (int i = 0; i < limit; i++)
     mask |= base_mask << (i * npatterns ());
@@ -1923,7 +1924,7 @@ expand_vector_init_merge_repeating_sequence (rtx target,
       rtx mask = gen_reg_rtx (mask_mode);
       rtx dup = gen_reg_rtx (dup_mode);
-      if (full_nelts <= BITS_PER_WORD) /* vmv.s.x.  */
+      if (full_nelts <= builder.inner_bits_size ()) /* vmv.s.x.  */
{
  rtx ops[] = {dup, gen_scalar_move_mask (dup_mask_mode),
    RVV_VUNDEF (dup_mode), merge_mask};
@@ -1933,7 +1934,8 @@ expand_vector_init_merge_repeating_sequence (rtx target,
       else /* vmv.v.x.  */
{
  rtx ops[] = {dup, force_reg (GET_MODE_INNER (dup_mode), merge_mask)};
-   rtx vl = gen_int_mode (CEIL (full_nelts, BITS_PER_WORD), Pmode);
+   rtx vl = gen_int_mode (CEIL (full_nelts, builder.inner_bits_size ()),
+ Pmode);
  emit_nonvlmax_integer_move_insn (code_for_pred_broadcast (dup_mode),
   ops, vl);
}
-- 
2.34.1
 
 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH v2] RISC-V: Bugfix for vec_init repeating auto vectorization in RV32
  2023-06-14  0:58 [PATCH v1] RISC-V: Bugfix for vec_init repeating auto vectorization in RV32 pan2.li
  2023-06-14  1:07 ` juzhe.zhong
@ 2023-06-14  7:29 ` pan2.li
  2023-06-14  7:43   ` juzhe.zhong
  2023-06-14  8:27   ` Robin Dapp
  2023-06-14  9:00 ` [PATCH v3] " pan2.li
  2 siblings, 2 replies; 11+ messages in thread
From: pan2.li @ 2023-06-14  7:29 UTC (permalink / raw)
  To: gcc-patches
  Cc: juzhe.zhong, rdapp.gcc, jeffreyalaw, pan2.li, yanzhang.wang, kito.cheng

From: Pan Li <pan2.li@intel.com>

This patch would like to fix one bug exported by RV32 test case
multiple_rgroup_run-2.c. The mask should be restricted by elen in
vector, and the condition between the vmv.s.x and the vmv.v.x should
take inner_bits_size rather than constants.

After this patch, below failures on RV32 will be fixed.

FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution test
FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution test
FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution test
FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution test
FAIL: gcc.target/riscv/rvv/autovec/vls-vlmax/repeat_run-3.c -std=c99 -O3 -ftree-vectorize --param riscv-autovec-preference=fixed-vlmax execution test

Signed-off-by: Pan Li <pan2.li@intel.com>

gcc/ChangeLog:

	* config/riscv/riscv-v.cc (rvv_builder::get_merge_scalar_mask):
	Take elen instead of scalar BITS_PER_WORD.
	(expand_vector_init_merge_repeating_sequence): Use inner_bits_size
	instead of scaler BITS_PER_WORD.
---
 gcc/config/riscv/riscv-v.cc | 18 ++++++++++++++----
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index e07d5c2901a..db1a5529419 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -399,10 +399,19 @@ rvv_builder::get_merge_scalar_mask (unsigned int index_in_pattern) const
 {
   unsigned HOST_WIDE_INT mask = 0;
   unsigned HOST_WIDE_INT base_mask = (1ULL << index_in_pattern);
+  /* We restrict the limit to the elen of RVV. For example:
+     -march=zve32*, the ELEN is 32.
+     -march=zve64*, the ELEN is 64.
+     The related vmv.v.x/vmv.s.x is restricted to ELEN as above, we cannot
+     take care of case like below when ELEN=32
+     vsetvil e64,m1
+     vmv.v.x/vmv.s.x
+   */
+  unsigned int elen = TARGET_VECTOR_ELEN_64 ? 64 : 32;
 
-  gcc_assert (BITS_PER_WORD % npatterns () == 0);
+  gcc_assert (elen % npatterns () == 0);
 
-  int limit = BITS_PER_WORD / npatterns ();
+  int limit = elen / npatterns ();
 
   for (int i = 0; i < limit; i++)
     mask |= base_mask << (i * npatterns ());
@@ -1928,7 +1937,7 @@ expand_vector_init_merge_repeating_sequence (rtx target,
       rtx mask = gen_reg_rtx (mask_mode);
       rtx dup = gen_reg_rtx (dup_mode);
 
-      if (full_nelts <= BITS_PER_WORD) /* vmv.s.x.  */
+      if (full_nelts <= builder.inner_bits_size ()) /* vmv.s.x.  */
 	{
 	  rtx ops[] = {dup, gen_scalar_move_mask (dup_mask_mode),
 	    RVV_VUNDEF (dup_mode), merge_mask};
@@ -1938,7 +1947,8 @@ expand_vector_init_merge_repeating_sequence (rtx target,
       else /* vmv.v.x.  */
 	{
 	  rtx ops[] = {dup, force_reg (GET_MODE_INNER (dup_mode), merge_mask)};
-	  rtx vl = gen_int_mode (CEIL (full_nelts, BITS_PER_WORD), Pmode);
+	  rtx vl = gen_int_mode (CEIL (full_nelts, builder.inner_bits_size ()),
+				 Pmode);
 	  emit_nonvlmax_integer_move_insn (code_for_pred_broadcast (dup_mode),
 					   ops, vl);
 	}
-- 
2.34.1


^ permalink raw reply	[flat|nested] 11+ messages in thread

* RE: [PATCH v1] RISC-V: Bugfix for vec_init repeating auto vectorization in RV32
  2023-06-14  1:07 ` juzhe.zhong
@ 2023-06-14  7:30   ` Li, Pan2
  0 siblings, 0 replies; 11+ messages in thread
From: Li, Pan2 @ 2023-06-14  7:30 UTC (permalink / raw)
  To: juzhe.zhong, gcc-patches
  Cc: Robin Dapp, jeffreyalaw, Wang, Yanzhang, kito.cheng

[-- Attachment #1: Type: text/plain, Size: 3897 bytes --]

Thanks Juzhe for reviewing, update the PATCH v2 as below.

https://gcc.gnu.org/pipermail/gcc-patches/2023-June/621645.html

Pan

From: juzhe.zhong@rivai.ai <juzhe.zhong@rivai.ai>
Sent: Wednesday, June 14, 2023 9:07 AM
To: Li, Pan2 <pan2.li@intel.com>; gcc-patches <gcc-patches@gcc.gnu.org>
Cc: Robin Dapp <rdapp.gcc@gmail.com>; jeffreyalaw <jeffreyalaw@gmail.com>; Li, Pan2 <pan2.li@intel.com>; Wang, Yanzhang <yanzhang.wang@intel.com>; kito.cheng <kito.cheng@gmail.com>
Subject: Re: [PATCH v1] RISC-V: Bugfix for vec_init repeating auto vectorization in RV32


>> unsigned int elen = TARGET_VECTOR_ELEN_64 ? 64 : 32;
Add comment here to demonstrate why you pick up elen to set the LIMIT.
I understand:
1. -march=zve32* ===> ELEN = 32
    -march=zve64* ===> ELEN = 64
2. both vmv.v.x/vmv.s.x is restrict to the ELEN
    For example, When ELEN=32 (-march=zve32*)
    vsetvli ...e64,m1
    vmv.v.x/vmv.s.x
    We can't support such code sequence.

You should demonstrate it clearly in the comments.

Otherwise, this patch LGTM.
________________________________
juzhe.zhong@rivai.ai<mailto:juzhe.zhong@rivai.ai>

From: pan2.li<mailto:pan2.li@intel.com>
Date: 2023-06-14 08:58
To: gcc-patches<mailto:gcc-patches@gcc.gnu.org>
CC: juzhe.zhong<mailto:juzhe.zhong@rivai.ai>; rdapp.gcc<mailto:rdapp.gcc@gmail.com>; jeffreyalaw<mailto:jeffreyalaw@gmail.com>; pan2.li<mailto:pan2.li@intel.com>; yanzhang.wang<mailto:yanzhang.wang@intel.com>; kito.cheng<mailto:kito.cheng@gmail.com>
Subject: [PATCH v1] RISC-V: Bugfix for vec_init repeating auto vectorization in RV32
From: Pan Li <pan2.li@intel.com<mailto:pan2.li@intel.com>>

This patch would like to fix one bug exported by RV32 test case
multiple_rgroup_run-2.c. The mask should be restricted by elen in
vector, and the condition between the vmv.s.x and the vmv.v.x should
take inner_bits_size rather than constants.

Passed both the rv32 and rv64 riscv/rvv tests.

Signed-off-by: Pan Li <pan2.li@intel.com<mailto:pan2.li@intel.com>>

gcc/ChangeLog:

* config/riscv/riscv-v.cc (rvv_builder::get_merge_scalar_mask):
Take elen instead of scalar BITS_PER_WORD.
(expand_vector_init_merge_repeating_sequence): Use inner_bits_size
instead of scaler BITS_PER_WORD.
---
gcc/config/riscv/riscv-v.cc | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index fb970344521..9270e258ca3 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -399,10 +399,11 @@ rvv_builder::get_merge_scalar_mask (unsigned int index_in_pattern) const
{
   unsigned HOST_WIDE_INT mask = 0;
   unsigned HOST_WIDE_INT base_mask = (1ULL << index_in_pattern);
+  unsigned int elen = TARGET_VECTOR_ELEN_64 ? 64 : 32;
-  gcc_assert (BITS_PER_WORD % npatterns () == 0);
+  gcc_assert (elen % npatterns () == 0);
-  int limit = BITS_PER_WORD / npatterns ();
+  int limit = elen / npatterns ();
   for (int i = 0; i < limit; i++)
     mask |= base_mask << (i * npatterns ());
@@ -1923,7 +1924,7 @@ expand_vector_init_merge_repeating_sequence (rtx target,
       rtx mask = gen_reg_rtx (mask_mode);
       rtx dup = gen_reg_rtx (dup_mode);
-      if (full_nelts <= BITS_PER_WORD) /* vmv.s.x.  */
+      if (full_nelts <= builder.inner_bits_size ()) /* vmv.s.x.  */
{
  rtx ops[] = {dup, gen_scalar_move_mask (dup_mask_mode),
    RVV_VUNDEF (dup_mode), merge_mask};
@@ -1933,7 +1934,8 @@ expand_vector_init_merge_repeating_sequence (rtx target,
       else /* vmv.v.x.  */
{
  rtx ops[] = {dup, force_reg (GET_MODE_INNER (dup_mode), merge_mask)};
-   rtx vl = gen_int_mode (CEIL (full_nelts, BITS_PER_WORD), Pmode);
+   rtx vl = gen_int_mode (CEIL (full_nelts, builder.inner_bits_size ()),
+ Pmode);
  emit_nonvlmax_integer_move_insn (code_for_pred_broadcast (dup_mode),
   ops, vl);
}
--
2.34.1



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2] RISC-V: Bugfix for vec_init repeating auto vectorization in RV32
  2023-06-14  7:29 ` [PATCH v2] " pan2.li
@ 2023-06-14  7:43   ` juzhe.zhong
  2023-06-14  8:27   ` Robin Dapp
  1 sibling, 0 replies; 11+ messages in thread
From: juzhe.zhong @ 2023-06-14  7:43 UTC (permalink / raw)
  To: pan2.li, gcc-patches
  Cc: Robin Dapp, jeffreyalaw, pan2.li, yanzhang.wang, kito.cheng

[-- Attachment #1: Type: text/plain, Size: 4011 bytes --]

+  /* We restrict the limit to the elen of RVV. For example:
+     -march=zve32*, the ELEN is 32.
+     -march=zve64*, the ELEN is 64.
+     The related vmv.v.x/vmv.s.x is restricted to ELEN as above, we cannot
+     take care of case like below when ELEN=32
+     vsetvil e64,m1
+     vmv.v.x/vmv.s.x
+   */

The comment is not clear enough.

How about:

According to RVV ISA SPEC, ELEN = 32 when -march=zve32* and ELEN = 64 when -march=zve64*.
Since vmv.v.x/vmv.s.x can't broadcast/move 64-bit value to the vector when ELEN = 32, we restrict the LIMIT to the ELEN.

I am not the native English speaker, I'd like to see Jeff or Robin comments that.

Thanks. 


juzhe.zhong@rivai.ai
 
From: pan2.li
Date: 2023-06-14 15:29
To: gcc-patches
CC: juzhe.zhong; rdapp.gcc; jeffreyalaw; pan2.li; yanzhang.wang; kito.cheng
Subject: [PATCH v2] RISC-V: Bugfix for vec_init repeating auto vectorization in RV32
From: Pan Li <pan2.li@intel.com>
 
This patch would like to fix one bug exported by RV32 test case
multiple_rgroup_run-2.c. The mask should be restricted by elen in
vector, and the condition between the vmv.s.x and the vmv.v.x should
take inner_bits_size rather than constants.
 
After this patch, below failures on RV32 will be fixed.
 
FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution test
FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution test
FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution test
FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution test
FAIL: gcc.target/riscv/rvv/autovec/vls-vlmax/repeat_run-3.c -std=c99 -O3 -ftree-vectorize --param riscv-autovec-preference=fixed-vlmax execution test
 
Signed-off-by: Pan Li <pan2.li@intel.com>
 
gcc/ChangeLog:
 
* config/riscv/riscv-v.cc (rvv_builder::get_merge_scalar_mask):
Take elen instead of scalar BITS_PER_WORD.
(expand_vector_init_merge_repeating_sequence): Use inner_bits_size
instead of scaler BITS_PER_WORD.
---
gcc/config/riscv/riscv-v.cc | 18 ++++++++++++++----
1 file changed, 14 insertions(+), 4 deletions(-)
 
diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index e07d5c2901a..db1a5529419 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -399,10 +399,19 @@ rvv_builder::get_merge_scalar_mask (unsigned int index_in_pattern) const
{
   unsigned HOST_WIDE_INT mask = 0;
   unsigned HOST_WIDE_INT base_mask = (1ULL << index_in_pattern);
+  /* We restrict the limit to the elen of RVV. For example:
+     -march=zve32*, the ELEN is 32.
+     -march=zve64*, the ELEN is 64.
+     The related vmv.v.x/vmv.s.x is restricted to ELEN as above, we cannot
+     take care of case like below when ELEN=32
+     vsetvil e64,m1
+     vmv.v.x/vmv.s.x
+   */
+  unsigned int elen = TARGET_VECTOR_ELEN_64 ? 64 : 32;
-  gcc_assert (BITS_PER_WORD % npatterns () == 0);
+  gcc_assert (elen % npatterns () == 0);
-  int limit = BITS_PER_WORD / npatterns ();
+  int limit = elen / npatterns ();
   for (int i = 0; i < limit; i++)
     mask |= base_mask << (i * npatterns ());
@@ -1928,7 +1937,7 @@ expand_vector_init_merge_repeating_sequence (rtx target,
       rtx mask = gen_reg_rtx (mask_mode);
       rtx dup = gen_reg_rtx (dup_mode);
-      if (full_nelts <= BITS_PER_WORD) /* vmv.s.x.  */
+      if (full_nelts <= builder.inner_bits_size ()) /* vmv.s.x.  */
{
  rtx ops[] = {dup, gen_scalar_move_mask (dup_mask_mode),
    RVV_VUNDEF (dup_mode), merge_mask};
@@ -1938,7 +1947,8 @@ expand_vector_init_merge_repeating_sequence (rtx target,
       else /* vmv.v.x.  */
{
  rtx ops[] = {dup, force_reg (GET_MODE_INNER (dup_mode), merge_mask)};
-   rtx vl = gen_int_mode (CEIL (full_nelts, BITS_PER_WORD), Pmode);
+   rtx vl = gen_int_mode (CEIL (full_nelts, builder.inner_bits_size ()),
+ Pmode);
  emit_nonvlmax_integer_move_insn (code_for_pred_broadcast (dup_mode),
   ops, vl);
}
-- 
2.34.1
 
 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2] RISC-V: Bugfix for vec_init repeating auto vectorization in RV32
  2023-06-14  7:29 ` [PATCH v2] " pan2.li
  2023-06-14  7:43   ` juzhe.zhong
@ 2023-06-14  8:27   ` Robin Dapp
  2023-06-14  8:34     ` Li, Pan2
  1 sibling, 1 reply; 11+ messages in thread
From: Robin Dapp @ 2023-06-14  8:27 UTC (permalink / raw)
  To: pan2.li, gcc-patches
  Cc: rdapp.gcc, juzhe.zhong, jeffreyalaw, yanzhang.wang, kito.cheng

Hi Pan,

> This patch would like to fix one bug exported by RV32 test case
> multiple_rgroup_run-2.c. The mask should be restricted by elen in
> vector, and the condition between the vmv.s.x and the vmv.v.x should
> take inner_bits_size rather than constants.

exported -> exposed.

How about something like:

"When constructing a vector mask from individual elements we wrongly
assumed that we can broadcast BITS_PER_WORD (i.e. XLEN).  The maximum
is actually the vector element length (i.e. ELEN).  This patch fixes
this."?

> +  /* We restrict the limit to the elen of RVV. For example:
> +     -march=zve32*, the ELEN is 32.
> +     -march=zve64*, the ELEN is 64.
> +     The related vmv.v.x/vmv.s.x is restricted to ELEN as above, we cannot
> +     take care of case like below when ELEN=32
> +     vsetvil e64,m1
> +     vmv.v.x/vmv.s.x
> +   */

/* Here we construct a mask pattern that will later be broadcast
   to a vector register.  The maximum broadcast size for vmv.v.x/vmv.s.x
   is determined by the length of a vector element (ELEN) and not by
   XLEN so make sure we do not exceed it.  One example is -march=zve32*
   which mandates ELEN == 32 but can be combined with -march=rv64
   with XLEN == 64.  */

Regards
 Robin

^ permalink raw reply	[flat|nested] 11+ messages in thread

* RE: [PATCH v2] RISC-V: Bugfix for vec_init repeating auto vectorization in RV32
  2023-06-14  8:27   ` Robin Dapp
@ 2023-06-14  8:34     ` Li, Pan2
  0 siblings, 0 replies; 11+ messages in thread
From: Li, Pan2 @ 2023-06-14  8:34 UTC (permalink / raw)
  To: Robin Dapp, gcc-patches
  Cc: juzhe.zhong, jeffreyalaw, Wang, Yanzhang, kito.cheng

Thanks Robin, that looks like much better than the v2, let me update it to PATCH v3.

Pan

-----Original Message-----
From: Robin Dapp <rdapp.gcc@gmail.com> 
Sent: Wednesday, June 14, 2023 4:27 PM
To: Li, Pan2 <pan2.li@intel.com>; gcc-patches@gcc.gnu.org
Cc: rdapp.gcc@gmail.com; juzhe.zhong@rivai.ai; jeffreyalaw@gmail.com; Wang, Yanzhang <yanzhang.wang@intel.com>; kito.cheng@gmail.com
Subject: Re: [PATCH v2] RISC-V: Bugfix for vec_init repeating auto vectorization in RV32

Hi Pan,

> This patch would like to fix one bug exported by RV32 test case
> multiple_rgroup_run-2.c. The mask should be restricted by elen in
> vector, and the condition between the vmv.s.x and the vmv.v.x should
> take inner_bits_size rather than constants.

exported -> exposed.

How about something like:

"When constructing a vector mask from individual elements we wrongly
assumed that we can broadcast BITS_PER_WORD (i.e. XLEN).  The maximum
is actually the vector element length (i.e. ELEN).  This patch fixes
this."?

> +  /* We restrict the limit to the elen of RVV. For example:
> +     -march=zve32*, the ELEN is 32.
> +     -march=zve64*, the ELEN is 64.
> +     The related vmv.v.x/vmv.s.x is restricted to ELEN as above, we cannot
> +     take care of case like below when ELEN=32
> +     vsetvil e64,m1
> +     vmv.v.x/vmv.s.x
> +   */

/* Here we construct a mask pattern that will later be broadcast
   to a vector register.  The maximum broadcast size for vmv.v.x/vmv.s.x
   is determined by the length of a vector element (ELEN) and not by
   XLEN so make sure we do not exceed it.  One example is -march=zve32*
   which mandates ELEN == 32 but can be combined with -march=rv64
   with XLEN == 64.  */

Regards
 Robin

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH v3] RISC-V: Bugfix for vec_init repeating auto vectorization in RV32
  2023-06-14  0:58 [PATCH v1] RISC-V: Bugfix for vec_init repeating auto vectorization in RV32 pan2.li
  2023-06-14  1:07 ` juzhe.zhong
  2023-06-14  7:29 ` [PATCH v2] " pan2.li
@ 2023-06-14  9:00 ` pan2.li
  2023-06-14  9:01   ` juzhe.zhong
  2 siblings, 1 reply; 11+ messages in thread
From: pan2.li @ 2023-06-14  9:00 UTC (permalink / raw)
  To: gcc-patches
  Cc: juzhe.zhong, rdapp.gcc, jeffreyalaw, pan2.li, yanzhang.wang, kito.cheng

From: Pan Li <pan2.li@intel.com>

When constructing a vector mask from individual elements we wrongly
assumed that we can broadcast BITS_PER_WORD (i.e. XLEN).  The maximum is
actually the vector element length (i.e. ELEN).  This patch fixes this.

After this patch, below failures on RV32 will be fixed.

FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution test
FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution test
FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution test
FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution test
FAIL: gcc.target/riscv/rvv/autovec/vls-vlmax/repeat_run-3.c -std=c99 -O3 -ftree-vectorize --param riscv-autovec-preference=fixed-vlmax execution test

Signed-off-by: Pan Li <pan2.li@intel.com>

gcc/ChangeLog:

	* config/riscv/riscv-v.cc (rvv_builder::get_merge_scalar_mask):
	Take elen instead of scalar BITS_PER_WORD.
	(expand_vector_init_merge_repeating_sequence): Use inner_bits_size
	instead of scaler BITS_PER_WORD.
---
 gcc/config/riscv/riscv-v.cc | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index e07d5c2901a..01f647bc0bd 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -399,10 +399,17 @@ rvv_builder::get_merge_scalar_mask (unsigned int index_in_pattern) const
 {
   unsigned HOST_WIDE_INT mask = 0;
   unsigned HOST_WIDE_INT base_mask = (1ULL << index_in_pattern);
+  /* Here we construct a mask pattern that will later be broadcast
+     to a vector register.  The maximum broadcast size for vmv.v.x/vmv.s.x
+     is determined by the length of a vector element (ELEN) and not by
+     XLEN so make sure we do not exceed it.  One example is -march=zve32*
+     which mandates ELEN == 32 but can be combined with -march=rv64
+     with XLEN == 64.  */
+  unsigned int elen = TARGET_VECTOR_ELEN_64 ? 64 : 32;
 
-  gcc_assert (BITS_PER_WORD % npatterns () == 0);
+  gcc_assert (elen % npatterns () == 0);
 
-  int limit = BITS_PER_WORD / npatterns ();
+  int limit = elen / npatterns ();
 
   for (int i = 0; i < limit; i++)
     mask |= base_mask << (i * npatterns ());
@@ -1928,7 +1935,7 @@ expand_vector_init_merge_repeating_sequence (rtx target,
       rtx mask = gen_reg_rtx (mask_mode);
       rtx dup = gen_reg_rtx (dup_mode);
 
-      if (full_nelts <= BITS_PER_WORD) /* vmv.s.x.  */
+      if (full_nelts <= builder.inner_bits_size ()) /* vmv.s.x.  */
 	{
 	  rtx ops[] = {dup, gen_scalar_move_mask (dup_mask_mode),
 	    RVV_VUNDEF (dup_mode), merge_mask};
@@ -1938,7 +1945,8 @@ expand_vector_init_merge_repeating_sequence (rtx target,
       else /* vmv.v.x.  */
 	{
 	  rtx ops[] = {dup, force_reg (GET_MODE_INNER (dup_mode), merge_mask)};
-	  rtx vl = gen_int_mode (CEIL (full_nelts, BITS_PER_WORD), Pmode);
+	  rtx vl = gen_int_mode (CEIL (full_nelts, builder.inner_bits_size ()),
+				 Pmode);
 	  emit_nonvlmax_integer_move_insn (code_for_pred_broadcast (dup_mode),
 					   ops, vl);
 	}
-- 
2.34.1


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v3] RISC-V: Bugfix for vec_init repeating auto vectorization in RV32
  2023-06-14  9:00 ` [PATCH v3] " pan2.li
@ 2023-06-14  9:01   ` juzhe.zhong
  2023-06-14 18:56     ` Jeff Law
  0 siblings, 1 reply; 11+ messages in thread
From: juzhe.zhong @ 2023-06-14  9:01 UTC (permalink / raw)
  To: pan2.li, gcc-patches
  Cc: Robin Dapp, jeffreyalaw, pan2.li, yanzhang.wang, kito.cheng

[-- Attachment #1: Type: text/plain, Size: 3383 bytes --]

LGTM



juzhe.zhong@rivai.ai
 
From: pan2.li
Date: 2023-06-14 17:00
To: gcc-patches
CC: juzhe.zhong; rdapp.gcc; jeffreyalaw; pan2.li; yanzhang.wang; kito.cheng
Subject: [PATCH v3] RISC-V: Bugfix for vec_init repeating auto vectorization in RV32
From: Pan Li <pan2.li@intel.com>
 
When constructing a vector mask from individual elements we wrongly
assumed that we can broadcast BITS_PER_WORD (i.e. XLEN).  The maximum is
actually the vector element length (i.e. ELEN).  This patch fixes this.
 
After this patch, below failures on RV32 will be fixed.
 
FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution test
FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution test
FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution test
FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution test
FAIL: gcc.target/riscv/rvv/autovec/vls-vlmax/repeat_run-3.c -std=c99 -O3 -ftree-vectorize --param riscv-autovec-preference=fixed-vlmax execution test
 
Signed-off-by: Pan Li <pan2.li@intel.com>
 
gcc/ChangeLog:
 
* config/riscv/riscv-v.cc (rvv_builder::get_merge_scalar_mask):
Take elen instead of scalar BITS_PER_WORD.
(expand_vector_init_merge_repeating_sequence): Use inner_bits_size
instead of scaler BITS_PER_WORD.
---
gcc/config/riscv/riscv-v.cc | 16 ++++++++++++----
1 file changed, 12 insertions(+), 4 deletions(-)
 
diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index e07d5c2901a..01f647bc0bd 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -399,10 +399,17 @@ rvv_builder::get_merge_scalar_mask (unsigned int index_in_pattern) const
{
   unsigned HOST_WIDE_INT mask = 0;
   unsigned HOST_WIDE_INT base_mask = (1ULL << index_in_pattern);
+  /* Here we construct a mask pattern that will later be broadcast
+     to a vector register.  The maximum broadcast size for vmv.v.x/vmv.s.x
+     is determined by the length of a vector element (ELEN) and not by
+     XLEN so make sure we do not exceed it.  One example is -march=zve32*
+     which mandates ELEN == 32 but can be combined with -march=rv64
+     with XLEN == 64.  */
+  unsigned int elen = TARGET_VECTOR_ELEN_64 ? 64 : 32;
-  gcc_assert (BITS_PER_WORD % npatterns () == 0);
+  gcc_assert (elen % npatterns () == 0);
-  int limit = BITS_PER_WORD / npatterns ();
+  int limit = elen / npatterns ();
   for (int i = 0; i < limit; i++)
     mask |= base_mask << (i * npatterns ());
@@ -1928,7 +1935,7 @@ expand_vector_init_merge_repeating_sequence (rtx target,
       rtx mask = gen_reg_rtx (mask_mode);
       rtx dup = gen_reg_rtx (dup_mode);
-      if (full_nelts <= BITS_PER_WORD) /* vmv.s.x.  */
+      if (full_nelts <= builder.inner_bits_size ()) /* vmv.s.x.  */
{
  rtx ops[] = {dup, gen_scalar_move_mask (dup_mask_mode),
    RVV_VUNDEF (dup_mode), merge_mask};
@@ -1938,7 +1945,8 @@ expand_vector_init_merge_repeating_sequence (rtx target,
       else /* vmv.v.x.  */
{
  rtx ops[] = {dup, force_reg (GET_MODE_INNER (dup_mode), merge_mask)};
-   rtx vl = gen_int_mode (CEIL (full_nelts, BITS_PER_WORD), Pmode);
+   rtx vl = gen_int_mode (CEIL (full_nelts, builder.inner_bits_size ()),
+ Pmode);
  emit_nonvlmax_integer_move_insn (code_for_pred_broadcast (dup_mode),
   ops, vl);
}
-- 
2.34.1
 
 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v3] RISC-V: Bugfix for vec_init repeating auto vectorization in RV32
  2023-06-14  9:01   ` juzhe.zhong
@ 2023-06-14 18:56     ` Jeff Law
  2023-06-15  1:05       ` Li, Pan2
  0 siblings, 1 reply; 11+ messages in thread
From: Jeff Law @ 2023-06-14 18:56 UTC (permalink / raw)
  To: juzhe.zhong, pan2.li, gcc-patches; +Cc: Robin Dapp, yanzhang.wang, kito.cheng



On 6/14/23 03:01, juzhe.zhong@rivai.ai wrote:
> LGTM
Agreed.  Commit when convenient.

jeff

^ permalink raw reply	[flat|nested] 11+ messages in thread

* RE: [PATCH v3] RISC-V: Bugfix for vec_init repeating auto vectorization in RV32
  2023-06-14 18:56     ` Jeff Law
@ 2023-06-15  1:05       ` Li, Pan2
  0 siblings, 0 replies; 11+ messages in thread
From: Li, Pan2 @ 2023-06-15  1:05 UTC (permalink / raw)
  To: Jeff Law, juzhe.zhong, gcc-patches; +Cc: Robin Dapp, Wang, Yanzhang, kito.cheng

Committed, thanks Jeff and Juzhe.

Pan

-----Original Message-----
From: Jeff Law <jeffreyalaw@gmail.com> 
Sent: Thursday, June 15, 2023 2:56 AM
To: juzhe.zhong@rivai.ai; Li, Pan2 <pan2.li@intel.com>; gcc-patches <gcc-patches@gcc.gnu.org>
Cc: Robin Dapp <rdapp.gcc@gmail.com>; Wang, Yanzhang <yanzhang.wang@intel.com>; kito.cheng <kito.cheng@gmail.com>
Subject: Re: [PATCH v3] RISC-V: Bugfix for vec_init repeating auto vectorization in RV32



On 6/14/23 03:01, juzhe.zhong@rivai.ai wrote:
> LGTM
Agreed.  Commit when convenient.

jeff

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2023-06-15  1:06 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-06-14  0:58 [PATCH v1] RISC-V: Bugfix for vec_init repeating auto vectorization in RV32 pan2.li
2023-06-14  1:07 ` juzhe.zhong
2023-06-14  7:30   ` Li, Pan2
2023-06-14  7:29 ` [PATCH v2] " pan2.li
2023-06-14  7:43   ` juzhe.zhong
2023-06-14  8:27   ` Robin Dapp
2023-06-14  8:34     ` Li, Pan2
2023-06-14  9:00 ` [PATCH v3] " pan2.li
2023-06-14  9:01   ` juzhe.zhong
2023-06-14 18:56     ` Jeff Law
2023-06-15  1:05       ` Li, Pan2

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).