* [PATCH] RISC-V: Refine the condition for add additional vars in RVV cost model
@ 2024-03-28 10:31 demin.han
2024-03-28 10:44 ` juzhe.zhong
` (2 more replies)
0 siblings, 3 replies; 10+ messages in thread
From: demin.han @ 2024-03-28 10:31 UTC (permalink / raw)
To: gcc-patches; +Cc: juzhe.zhong, kito.cheng, pan2.li, jeffreyalaw, rdapp.gcc
adjacent_dr_p is a sufficient but not a necessary condition for contiguous access,
so the old check adds unnecessary live ranges, which results in spills.
This patch uses MEMORY_ACCESS_TYPE as the condition instead and constrains segment
load/store (an extra variable is counted only when DR_GROUP_SIZE * LMUL exceeds RVV_M8).
Tested on RV64 with no regressions.
PR target/114506
gcc/ChangeLog:
* config/riscv/riscv-vector-costs.cc (non_contiguous_memory_access_p): Rename
(need_additional_vector_vars_p): Rename and refine condition
gcc/testsuite/ChangeLog:
* gcc.dg/vect/costmodel/riscv/rvv/pr114506.c: New test.
Signed-off-by: demin.han <demin.han@starfivetech.com>
---
gcc/config/riscv/riscv-vector-costs.cc | 25 ++++++++++++-------
.../vect/costmodel/riscv/rvv/pr114506.c | 23 +++++++++++++++++
2 files changed, 39 insertions(+), 9 deletions(-)
create mode 100644 gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/pr114506.c
diff --git a/gcc/config/riscv/riscv-vector-costs.cc b/gcc/config/riscv/riscv-vector-costs.cc
index f462c272a6e..9f7fe936a29 100644
--- a/gcc/config/riscv/riscv-vector-costs.cc
+++ b/gcc/config/riscv/riscv-vector-costs.cc
@@ -563,14 +563,24 @@ get_store_value (gimple *stmt)
return gimple_assign_rhs1 (stmt);
}
-/* Return true if it is non-contiguous load/store. */
+/* Return true if additional vector vars are needed. */
static bool
-non_contiguous_memory_access_p (stmt_vec_info stmt_info)
+need_additional_vector_vars_p (stmt_vec_info stmt_info)
{
enum stmt_vec_info_type type
= STMT_VINFO_TYPE (vect_stmt_to_vectorize (stmt_info));
- return ((type == load_vec_info_type || type == store_vec_info_type)
- && !adjacent_dr_p (STMT_VINFO_DATA_REF (stmt_info)));
+ if (type == load_vec_info_type || type == store_vec_info_type)
+ {
+ if (STMT_VINFO_GATHER_SCATTER_P (stmt_info)
+ && STMT_VINFO_MEMORY_ACCESS_TYPE (stmt_info) == VMAT_GATHER_SCATTER)
+ return true;
+
+ machine_mode mode = TYPE_MODE (STMT_VINFO_VECTYPE (stmt_info));
+ int lmul = riscv_get_v_regno_alignment (mode);
+ if (DR_GROUP_SIZE (stmt_info) * lmul > RVV_M8)
+ return true;
+ }
+ return false;
}
/* Return the LMUL of the current analysis. */
@@ -739,10 +749,7 @@ update_local_live_ranges (
stmt_vec_info stmt_info = vinfo->lookup_stmt (gsi_stmt (si));
enum stmt_vec_info_type type
= STMT_VINFO_TYPE (vect_stmt_to_vectorize (stmt_info));
- if (non_contiguous_memory_access_p (stmt_info)
- /* LOAD_LANES/STORE_LANES doesn't need a perm indice. */
- && STMT_VINFO_MEMORY_ACCESS_TYPE (stmt_info)
- != VMAT_LOAD_STORE_LANES)
+ if (need_additional_vector_vars_p (stmt_info))
{
/* For non-adjacent load/store STMT, we will potentially
convert it into:
@@ -752,7 +759,7 @@ update_local_live_ranges (
We will be likely using one more vector variable. */
unsigned int max_point
- = (*program_points_per_bb.get (bb)).length () - 1;
+ = (*program_points_per_bb.get (bb)).length ();
auto *live_ranges = live_ranges_per_bb.get (bb);
bool existed_p = false;
tree var = type == load_vec_info_type
diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/pr114506.c b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/pr114506.c
new file mode 100644
index 00000000000..a88d24b2d2d
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/pr114506.c
@@ -0,0 +1,23 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize -mrvv-max-lmul=dynamic -fdump-tree-vect-details" } */
+
+float a[32000], b[32000], c[32000], d[32000];
+float aa[256][256], bb[256][256], cc[256][256];
+
+void
+s2275 ()
+{
+ for (int i = 0; i < 256; i++)
+ {
+ for (int j = 0; j < 256; j++)
+ {
+ aa[j][i] = aa[j][i] + bb[j][i] * cc[j][i];
+ }
+ a[i] = b[i] + c[i] * d[i];
+ }
+}
+
+/* { dg-final { scan-assembler-times {e32,m8} 1 } } */
+/* { dg-final { scan-assembler-not {e32,m4} } } */
+/* { dg-final { scan-assembler-not {csrr} } } */
+/* { dg-final { scan-tree-dump-not "Preferring smaller LMUL loop because it has unexpected spills" "vect" } } */
--
2.44.0
^ permalink raw reply [flat|nested] 10+ messages in thread
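The refined condition in the patch above can be modelled outside of GCC roughly as follows. This is only a sketch under stated assumptions: `MemAccess`, `AccessKind`, `kMaxLmul` and `needs_additional_vector_var` are hypothetical stand-ins for `stmt_vec_info`, `STMT_VINFO_MEMORY_ACCESS_TYPE`, `RVV_M8` and `need_additional_vector_vars_p`, and the two gather/scatter checks in the patch are collapsed into a single enum value.

```cpp
#include <cassert>

// Hypothetical stand-ins for the GCC internals used by the patch.
enum class AccessKind { Contiguous, LoadStoreLanes, GatherScatter };

// Simplified model of the fields the predicate reads from a stmt_vec_info.
struct MemAccess {
  bool is_load_or_store;  // models load_vec_info_type / store_vec_info_type
  AccessKind kind;        // models STMT_VINFO_MEMORY_ACCESS_TYPE
  unsigned group_size;    // models DR_GROUP_SIZE (fields of a segment access)
  unsigned lmul;          // models riscv_get_v_regno_alignment: 1, 2, 4 or 8
};

constexpr unsigned kMaxLmul = 8;  // models RVV_M8

// Mirrors the control flow of the new predicate: only loads and stores are
// considered, gather/scatter always needs an extra vector variable, and any
// other access needs one only if group_size * lmul exceeds M8.
bool needs_additional_vector_var(const MemAccess &a) {
  if (!a.is_load_or_store)
    return false;
  if (a.kind == AccessKind::GatherScatter)
    return true;
  return a.group_size * a.lmul > kMaxLmul;
}
```

Under this model a plain contiguous load at LMUL 8 (group size 1) no longer adds a live range, whereas the old `!adjacent_dr_p` check could add one.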
* Re: [PATCH] RISC-V: Refine the condition for add additional vars in RVV cost model
2024-03-28 10:31 [PATCH] RISC-V: Refine the condition for add additional vars in RVV cost model demin.han
@ 2024-03-28 10:44 ` juzhe.zhong
2024-03-28 11:06 ` Demin Han
2024-03-28 14:37 ` Jeff Law
2024-04-02 8:30 ` [PATCH v2] " demin.han
2 siblings, 1 reply; 10+ messages in thread
From: juzhe.zhong @ 2024-03-28 10:44 UTC (permalink / raw)
To: demin.han, gcc-patches; +Cc: kito.cheng, pan2.li, jeffreyalaw, Robin Dapp
Thanks a lot for trying to optimize the dynamic LMUL cost model.
The need_additional_vector_vars_p looks good to me.
But
- = (*program_points_per_bb.get (bb)).length () - 1;
+ = (*program_points_per_bb.get (bb)).length ();
I wonder why you removed the "- 1"?
juzhe.zhong@rivai.ai
* RE: [PATCH] RISC-V: Refine the condition for add additional vars in RVV cost model
2024-03-28 10:44 ` juzhe.zhong
@ 2024-03-28 11:06 ` Demin Han
2024-03-28 11:10 ` Re: " juzhe.zhong
0 siblings, 1 reply; 10+ messages in thread
From: Demin Han @ 2024-03-28 11:06 UTC (permalink / raw)
To: juzhe.zhong, gcc-patches; +Cc: kito.cheng, pan2.li, jeffreyalaw, Robin Dapp
Hi,
The program points start from 1, so max_point should equal length ().
Should I prepare a separate patch for this?
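The off-by-one argument can be illustrated with a small standalone model. This is a sketch, not the GCC data structure: `number_program_points` and `max_point` are hypothetical helpers, and the only property taken from the thread is that points are numbered starting from 1.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model: one program point per statement in a basic block,
// numbered from 1 as described in the reply above.
std::vector<unsigned> number_program_points(std::size_t n_stmts) {
  std::vector<unsigned> points;
  for (std::size_t i = 0; i < n_stmts; ++i)
    points.push_back(static_cast<unsigned>(i + 1));  // first point is 1, not 0
  return points;
}

// With 1-based numbering the largest point equals the number of entries,
// so using size() - 1 would stop one point short of the end of the block.
unsigned max_point(const std::vector<unsigned> &points) {
  return static_cast<unsigned>(points.size());
}
```

For a block with five statements, the last point is 5 and so is the vector length, which is why dropping the "- 1" makes the live range cover the whole block.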
* Re: RE: [PATCH] RISC-V: Refine the condition for add additional vars in RVV cost model
2024-03-28 11:06 ` Demin Han
@ 2024-03-28 11:10 ` juzhe.zhong
2024-03-28 11:25 ` Demin Han
0 siblings, 1 reply; 10+ messages in thread
From: juzhe.zhong @ 2024-03-28 11:10 UTC (permalink / raw)
To: demin.han, gcc-patches; +Cc: kito.cheng, pan2.li, jeffreyalaw, Robin Dapp
OK. It's an obvious fix, but it seems to be unrelated to the PR.
Could you split it into two separate patches?
Thanks.
juzhe.zhong@rivai.ai
* RE: RE: [PATCH] RISC-V: Refine the condition for add additional vars in RVV cost model
2024-03-28 11:10 ` Re: " juzhe.zhong
@ 2024-03-28 11:25 ` Demin Han
0 siblings, 0 replies; 10+ messages in thread
From: Demin Han @ 2024-03-28 11:25 UTC (permalink / raw)
To: juzhe.zhong, gcc-patches; +Cc: kito.cheng, pan2.li, jeffreyalaw, Robin Dapp
OK, I will split them.
Thanks.
* Re: [PATCH] RISC-V: Refine the condition for add additional vars in RVV cost model
2024-03-28 10:31 [PATCH] RISC-V: Refine the condition for add additional vars in RVV cost model demin.han
2024-03-28 10:44 ` juzhe.zhong
@ 2024-03-28 14:37 ` Jeff Law
2024-04-02 8:30 ` [PATCH v2] " demin.han
2 siblings, 0 replies; 10+ messages in thread
From: Jeff Law @ 2024-03-28 14:37 UTC (permalink / raw)
To: demin.han, gcc-patches; +Cc: juzhe.zhong, kito.cheng, pan2.li, rdapp.gcc
On 3/28/24 4:31 AM, demin.han wrote:
> adjacent_dr_p is a sufficient but not a necessary condition for contiguous access,
> so the old check adds unnecessary live ranges, which results in spills.
>
> This patch uses MEMORY_ACCESS_TYPE as the condition instead and constrains segment
> load/store.
>
> Tested on RV64 with no regressions.
>
> PR target/114506
>
> gcc/ChangeLog:
>
> * config/riscv/riscv-vector-costs.cc (non_contiguous_memory_access_p): Rename
> (need_additional_vector_vars_p): Rename and refine condition
>
> gcc/testsuite/ChangeLog:
>
> * gcc.dg/vect/costmodel/riscv/rvv/pr114506.c: New test.
Note that I think this should be deferred to gcc-15. It doesn't affect code
correctness AFAICT, and it's not a regression relative to gcc-13.
jeff
* [PATCH v2] RISC-V: Refine the condition for add additional vars in RVV cost model
2024-03-28 10:31 [PATCH] RISC-V: Refine the condition for add additional vars in RVV cost model demin.han
2024-03-28 10:44 ` juzhe.zhong
2024-03-28 14:37 ` Jeff Law
@ 2024-04-02 8:30 ` demin.han
2024-04-02 8:34 ` juzhe.zhong
2024-04-29 5:10 ` juzhe.zhong
2 siblings, 2 replies; 10+ messages in thread
From: demin.han @ 2024-04-02 8:30 UTC (permalink / raw)
To: gcc-patches; +Cc: juzhe.zhong, kito.cheng, pan2.li, jeffreyalaw, rdapp.gcc
adjacent_dr_p is a sufficient but not a necessary condition for contiguous access,
so the old check adds unnecessary live ranges, which results in a smaller LMUL.
This patch uses MEMORY_ACCESS_TYPE as the condition instead and constrains segment
load/store (an extra variable is counted only when DR_GROUP_SIZE * LMUL exceeds RVV_M8).
Tested on RV64 with no regressions.
PR target/114506
gcc/ChangeLog:
* config/riscv/riscv-vector-costs.cc (non_contiguous_memory_access_p): Rename
(need_additional_vector_vars_p): Rename and refine condition
gcc/testsuite/ChangeLog:
* gcc.dg/vect/costmodel/riscv/rvv/pr114506.c: New test.
Signed-off-by: demin.han <demin.han@starfivetech.com>
---
V2 changes:
1. Drop the max_point change (to be sent as a separate patch)
2. Minor change to the commit message
gcc/config/riscv/riscv-vector-costs.cc | 23 ++++++++++++-------
.../vect/costmodel/riscv/rvv/pr114506.c | 23 +++++++++++++++++++
2 files changed, 38 insertions(+), 8 deletions(-)
create mode 100644 gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/pr114506.c
diff --git a/gcc/config/riscv/riscv-vector-costs.cc b/gcc/config/riscv/riscv-vector-costs.cc
index f462c272a6e..484196b15b4 100644
--- a/gcc/config/riscv/riscv-vector-costs.cc
+++ b/gcc/config/riscv/riscv-vector-costs.cc
@@ -563,14 +563,24 @@ get_store_value (gimple *stmt)
return gimple_assign_rhs1 (stmt);
}
-/* Return true if it is non-contiguous load/store. */
+/* Return true if additional vector vars are needed. */
static bool
-non_contiguous_memory_access_p (stmt_vec_info stmt_info)
+need_additional_vector_vars_p (stmt_vec_info stmt_info)
{
enum stmt_vec_info_type type
= STMT_VINFO_TYPE (vect_stmt_to_vectorize (stmt_info));
- return ((type == load_vec_info_type || type == store_vec_info_type)
- && !adjacent_dr_p (STMT_VINFO_DATA_REF (stmt_info)));
+ if (type == load_vec_info_type || type == store_vec_info_type)
+ {
+ if (STMT_VINFO_GATHER_SCATTER_P (stmt_info)
+ && STMT_VINFO_MEMORY_ACCESS_TYPE (stmt_info) == VMAT_GATHER_SCATTER)
+ return true;
+
+ machine_mode mode = TYPE_MODE (STMT_VINFO_VECTYPE (stmt_info));
+ int lmul = riscv_get_v_regno_alignment (mode);
+ if (DR_GROUP_SIZE (stmt_info) * lmul > RVV_M8)
+ return true;
+ }
+ return false;
}
/* Return the LMUL of the current analysis. */
@@ -739,10 +749,7 @@ update_local_live_ranges (
stmt_vec_info stmt_info = vinfo->lookup_stmt (gsi_stmt (si));
enum stmt_vec_info_type type
= STMT_VINFO_TYPE (vect_stmt_to_vectorize (stmt_info));
- if (non_contiguous_memory_access_p (stmt_info)
- /* LOAD_LANES/STORE_LANES doesn't need a perm indice. */
- && STMT_VINFO_MEMORY_ACCESS_TYPE (stmt_info)
- != VMAT_LOAD_STORE_LANES)
+ if (need_additional_vector_vars_p (stmt_info))
{
/* For non-adjacent load/store STMT, we will potentially
convert it into:
diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/pr114506.c b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/pr114506.c
new file mode 100644
index 00000000000..a88d24b2d2d
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/pr114506.c
@@ -0,0 +1,23 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize -mrvv-max-lmul=dynamic -fdump-tree-vect-details" } */
+
+float a[32000], b[32000], c[32000], d[32000];
+float aa[256][256], bb[256][256], cc[256][256];
+
+void
+s2275 ()
+{
+  for (int i = 0; i < 256; i++)
+    {
+      for (int j = 0; j < 256; j++)
+        {
+          aa[j][i] = aa[j][i] + bb[j][i] * cc[j][i];
+        }
+      a[i] = b[i] + c[i] * d[i];
+    }
+}
+
+/* { dg-final { scan-assembler-times {e32,m8} 1 } } */
+/* { dg-final { scan-assembler-not {e32,m4} } } */
+/* { dg-final { scan-assembler-not {csrr} } } */
+/* { dg-final { scan-tree-dump-not "Preferring smaller LMUL loop because it has unexpected spills" "vect" } } */
--
2.44.0
^ permalink raw reply [flat|nested] 10+ messages in thread
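As a rough illustration of the condition introduced by the v2 patch above, the following standalone C++ sketch models need_additional_vector_vars_p with simplified stand-in types. The struct mem_stmt_model, the enum access_kind, and the numeric values given to the RVV_M* constants are assumptions made for this example only; they are not GCC's real data structures, and the real function reads this information from stmt_vec_info through the macros shown in the diff.

```cpp
// Simplified model of the predicate added by the patch.
// Every type below is a hypothetical stand-in, not a GCC type.

// Register-group multipliers. RVV_M8 mirrors the constant named in the
// patch; the value 8 is an assumption of this sketch.
enum rvv_lmul { RVV_M1 = 1, RVV_M2 = 2, RVV_M4 = 4, RVV_M8 = 8 };

// Stand-in for the relevant STMT_VINFO_MEMORY_ACCESS_TYPE values.
enum class access_kind { contiguous, load_store_lanes, gather_scatter };

struct mem_stmt_model
{
  bool is_load_or_store; // load_vec_info_type or store_vec_info_type
  access_kind kind;      // how the vectorizer accesses memory
  int group_size;        // stand-in for DR_GROUP_SIZE
  int lmul;              // stand-in for riscv_get_v_regno_alignment (mode)
};

// Return true if the cost model should account for additional vector
// variables (for example a permutation index) for this statement.
bool need_additional_vector_vars (const mem_stmt_model &s)
{
  if (!s.is_load_or_store)
    return false;

  // Gather/scatter accesses always need an index vector.
  if (s.kind == access_kind::gather_scatter)
    return true;

  // A grouped access whose fields no longer fit in 8 registers.
  return s.group_size * s.lmul > RVV_M8;
}
```

Under this model a gather/scatter access always asks for extra variables, while a grouped access does so only when the group size times LMUL exceeds 8. That presumably mirrors the RVV rule that a segment load/store may occupy at most 8 vector registers (NFIELDS * EMUL <= 8); a plain contiguous access never adds live ranges.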
* Re: [PATCH v2] RISC-V: Refine the condition for add additional vars in RVV cost model
2024-04-02 8:30 ` [PATCH v2] " demin.han
@ 2024-04-02 8:34 ` juzhe.zhong
2024-04-29 5:10 ` juzhe.zhong
1 sibling, 0 replies; 10+ messages in thread
From: juzhe.zhong @ 2024-04-02 8:34 UTC (permalink / raw)
To: demin.han, gcc-patches; +Cc: kito.cheng, pan2.li, jeffreyalaw, Robin Dapp
[-- Attachment #1: Type: text/plain, Size: 4145 bytes --]
Thanks for fixing it. LGTM to GCC-15 as Jeff suggested.
juzhe.zhong@rivai.ai
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v2] RISC-V: Refine the condition for add additional vars in RVV cost model
2024-04-02 8:30 ` [PATCH v2] " demin.han
2024-04-02 8:34 ` juzhe.zhong
@ 2024-04-29 5:10 ` juzhe.zhong
2024-04-29 11:26 ` Demin Han
1 sibling, 1 reply; 10+ messages in thread
From: juzhe.zhong @ 2024-04-29 5:10 UTC (permalink / raw)
To: demin.han, gcc-patches; +Cc: kito.cheng, pan2.li, jeffreyalaw, Robin Dapp
[-- Attachment #1: Type: text/plain, Size: 4160 bytes --]
Hi, Han.
GCC 14 has branched. You can commit it to trunk (GCC 15).
juzhe.zhong@rivai.ai
^ permalink raw reply [flat|nested] 10+ messages in thread
* RE: [PATCH v2] RISC-V: Refine the condition for add additional vars in RVV cost model
2024-04-29 5:10 ` juzhe.zhong
@ 2024-04-29 11:26 ` Demin Han
0 siblings, 0 replies; 10+ messages in thread
From: Demin Han @ 2024-04-29 11:26 UTC (permalink / raw)
To: juzhe.zhong, gcc-patches; +Cc: kito.cheng, pan2.li, jeffreyalaw, Robin Dapp
[-- Attachment #1: Type: text/plain, Size: 4941 bytes --]
Hi, Juzhe.
Thanks for the reminder.
I reran the regression tests and committed the patch.
Regards,
Demin
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2024-04-29 11:26 UTC | newest]
Thread overview: 10+ messages
2024-03-28 10:31 [PATCH] RISC-V: Refine the condition for add additional vars in RVV cost model demin.han
2024-03-28 10:44 ` juzhe.zhong
2024-03-28 11:06 ` Demin Han
2024-03-28 11:10 ` Reply: " juzhe.zhong
2024-03-28 11:25 ` Demin Han
2024-03-28 14:37 ` Jeff Law
2024-04-02 8:30 ` [PATCH v2] " demin.han
2024-04-02 8:34 ` juzhe.zhong
2024-04-29 5:10 ` juzhe.zhong
2024-04-29 11:26 ` Demin Han