public inbox for gcc-patches@gcc.gnu.org
* [PATCH v1] EXPMED: Allow vector mode for DSE extract_low_bits [PR111720]
@ 2023-11-02  3:14 pan2.li
  2023-11-02  8:19 ` Richard Biener
                   ` (3 more replies)
  0 siblings, 4 replies; 18+ messages in thread
From: pan2.li @ 2023-11-02  3:14 UTC
  To: gcc-patches
  Cc: juzhe.zhong, pan2.li, yanzhang.wang, kito.cheng, jeffreyalaw,
	richard.sandiford

From: Pan Li <pan2.li@intel.com>

extract_low_bits only tries the scalar integer path when the bitsizes
of mode and src_mode are not equal. When get_stored_val in DSE passes
a vector mode, it therefore always fails and returns NULL_RTX.

This patch allows vector modes in extract_low_bits if and only if the
size of mode is less than or equal to the size of src_mode.

Given the example code below, compiled with
--param=riscv-autovec-preference=fixed-vlmax:

vuint8m1_t test () {
  uint8_t arr[32] = {
    1, 2, 7, 1, 3, 4, 5, 3, 1, 0, 1, 2, 4, 4, 9, 9,
    1, 2, 7, 1, 3, 4, 5, 3, 1, 0, 1, 2, 4, 4, 9, 9,
  };

  return __riscv_vle8_v_u8m1(arr, 32);
}

Before this patch:

test:
  lui     a5,%hi(.LANCHOR0)
  addi    sp,sp,-32
  addi    a5,a5,%lo(.LANCHOR0)
  li      a3,32
  vl2re64.v       v2,0(a5)
  vsetvli zero,a3,e8,m1,ta,ma
  vs2r.v  v2,0(sp)             <== Unnecessary store to stack
  vle8.v  v1,0(sp)             <== Ditto
  vs1r.v  v1,0(a0)
  addi    sp,sp,32
  jr      ra

After this patch:

test:
  lui     a5,%hi(.LANCHOR0)
  addi    a5,a5,%lo(.LANCHOR0)
  li      a4,32
  addi    sp,sp,-32
  vsetvli zero,a4,e8,m1,ta,ma
  vle8.v  v1,0(a5)
  vs1r.v  v1,0(a0)
  addi    sp,sp,32
  jr      ra

The following tests pass with this patch:

* The x86 bootstrap and regression test.
* The aarch64 regression test.
* The risc-v regression test.

	PR target/111720

gcc/ChangeLog:

	* expmed.cc (extract_low_bits): Allow vector mode if the
	mode size is less than or equal to src_mode.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/base/pr111720-0.c: New test.
	* gcc.target/riscv/rvv/base/pr111720-1.c: New test.
	* gcc.target/riscv/rvv/base/pr111720-10.c: New test.
	* gcc.target/riscv/rvv/base/pr111720-2.c: New test.
	* gcc.target/riscv/rvv/base/pr111720-3.c: New test.
	* gcc.target/riscv/rvv/base/pr111720-4.c: New test.
	* gcc.target/riscv/rvv/base/pr111720-5.c: New test.
	* gcc.target/riscv/rvv/base/pr111720-6.c: New test.
	* gcc.target/riscv/rvv/base/pr111720-7.c: New test.
	* gcc.target/riscv/rvv/base/pr111720-8.c: New test.
	* gcc.target/riscv/rvv/base/pr111720-9.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>
---
 gcc/expmed.cc                                 | 44 ++++++++++++-------
 .../gcc.target/riscv/rvv/base/pr111720-0.c    | 18 ++++++++
 .../gcc.target/riscv/rvv/base/pr111720-1.c    | 18 ++++++++
 .../gcc.target/riscv/rvv/base/pr111720-10.c   | 18 ++++++++
 .../gcc.target/riscv/rvv/base/pr111720-2.c    | 18 ++++++++
 .../gcc.target/riscv/rvv/base/pr111720-3.c    | 18 ++++++++
 .../gcc.target/riscv/rvv/base/pr111720-4.c    | 18 ++++++++
 .../gcc.target/riscv/rvv/base/pr111720-5.c    | 18 ++++++++
 .../gcc.target/riscv/rvv/base/pr111720-6.c    | 18 ++++++++
 .../gcc.target/riscv/rvv/base/pr111720-7.c    | 21 +++++++++
 .../gcc.target/riscv/rvv/base/pr111720-8.c    | 18 ++++++++
 .../gcc.target/riscv/rvv/base/pr111720-9.c    | 15 +++++++
 12 files changed, 227 insertions(+), 15 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-0.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-10.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-3.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-4.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-5.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-6.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-7.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-8.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-9.c

diff --git a/gcc/expmed.cc b/gcc/expmed.cc
index b294eabb08d..5db83fe638c 100644
--- a/gcc/expmed.cc
+++ b/gcc/expmed.cc
@@ -2403,8 +2403,6 @@ extract_split_bit_field (rtx op0, opt_scalar_int_mode op0_mode,
 rtx
 extract_low_bits (machine_mode mode, machine_mode src_mode, rtx src)
 {
-  scalar_int_mode int_mode, src_int_mode;
-
   if (mode == src_mode)
     return src;
 
@@ -2437,22 +2435,38 @@ extract_low_bits (machine_mode mode, machine_mode src_mode, rtx src)
         return x;
     }
 
-  if (!int_mode_for_mode (src_mode).exists (&src_int_mode)
-      || !int_mode_for_mode (mode).exists (&int_mode))
-    return NULL_RTX;
+  if (VECTOR_MODE_P (mode) && VECTOR_MODE_P (src_mode))
+    {
+      if (maybe_gt (GET_MODE_BITSIZE (mode), GET_MODE_BITSIZE (src_mode))
+	|| !targetm.modes_tieable_p (mode, src_mode))
+	return NULL_RTX;
 
-  if (!targetm.modes_tieable_p (src_int_mode, src_mode))
-    return NULL_RTX;
-  if (!targetm.modes_tieable_p (int_mode, mode))
-    return NULL_RTX;
+      /* For vector modes, this is only allowed when bitsize (mode) <=
+	 bitsize (src_mode) and the two modes are tieable.  */
+      src = gen_lowpart (mode, src);
+    }
+  else
+    {
+      scalar_int_mode int_mode, src_int_mode;
 
-  src = gen_lowpart (src_int_mode, src);
-  if (!validate_subreg (int_mode, src_int_mode, src,
-			subreg_lowpart_offset (int_mode, src_int_mode)))
-    return NULL_RTX;
+      if (!int_mode_for_mode (src_mode).exists (&src_int_mode)
+	  || !int_mode_for_mode (mode).exists (&int_mode))
+	return NULL_RTX;
+
+      if (!targetm.modes_tieable_p (src_int_mode, src_mode))
+	return NULL_RTX;
+      if (!targetm.modes_tieable_p (int_mode, mode))
+	return NULL_RTX;
+
+      src = gen_lowpart (src_int_mode, src);
+      if (!validate_subreg (int_mode, src_int_mode, src,
+			    subreg_lowpart_offset (int_mode, src_int_mode)))
+	return NULL_RTX;
+
+      src = convert_modes (int_mode, src_int_mode, src, true);
+      src = gen_lowpart (mode, src);
+    }
 
-  src = convert_modes (int_mode, src_int_mode, src, true);
-  src = gen_lowpart (mode, src);
   return src;
 }
 \f
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-0.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-0.c
new file mode 100644
index 00000000000..a61e94a6d98
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-0.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -march=rv64gcv -mabi=lp64d -ftree-vectorize --param=riscv-autovec-preference=fixed-vlmax -Wno-psabi" } */
+
+#include "riscv_vector.h"
+
+vuint8m1_t test () {
+  uint8_t arr[32] = {
+    1, 2, 7, 1, 3, 4, 5, 3,
+    1, 0, 1, 2, 4, 4, 9, 9,
+    1, 2, 7, 1, 3, 4, 5, 3,
+    1, 0, 1, 2, 4, 4, 9, 9,
+  };
+
+  return __riscv_vle8_v_u8m1(arr, 32);
+}
+
+/* { dg-final { scan-assembler-not {vle[0-9]+\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} } } */
+/* { dg-final { scan-assembler-not {vs[0-9]+r\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-1.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-1.c
new file mode 100644
index 00000000000..46efd7379ac
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-1.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -march=rv64gcv -mabi=lp64d -ftree-vectorize --param=riscv-autovec-preference=fixed-vlmax -Wno-psabi" } */
+
+#include "riscv_vector.h"
+
+vuint8m2_t test () {
+  uint8_t arr[32] = {
+    1, 2, 7, 1, 3, 4, 5, 3,
+    1, 0, 1, 2, 4, 4, 9, 9,
+    1, 2, 7, 1, 3, 4, 5, 3,
+    1, 0, 1, 2, 4, 4, 9, 9,
+  };
+
+  return __riscv_vle8_v_u8m2(arr, 32);
+}
+
+/* { dg-final { scan-assembler-not {vle[0-9]+\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} } } */
+/* { dg-final { scan-assembler-not {vs[0-9]+r\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-10.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-10.c
new file mode 100644
index 00000000000..8bebac219a6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-10.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -march=rv64gcv -mabi=lp64d -ftree-vectorize --param=riscv-autovec-preference=fixed-vlmax -Wno-psabi" } */
+
+#include "riscv_vector.h"
+
+vbool4_t test () {
+  uint8_t arr[32] = {
+    1, 2, 7, 1, 3, 4, 5, 3,
+    1, 0, 1, 2, 4, 4, 9, 9,
+    1, 2, 7, 1, 3, 4, 5, 3,
+    1, 0, 1, 2, 4, 4, 9, 9,
+  };
+
+  return __riscv_vlm_v_b4(arr, 32);
+}
+
+/* { dg-final { scan-assembler-not {vle[0-9]+\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} } } */
+/* { dg-final { scan-assembler-not {vs[0-9]+r\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-2.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-2.c
new file mode 100644
index 00000000000..47e4243e02e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-2.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -march=rv64gcv -mabi=lp64d -ftree-vectorize --param=riscv-autovec-preference=fixed-vlmax -Wno-psabi" } */
+
+#include "riscv_vector.h"
+
+vuint8m1_t test () {
+  uint8_t arr[32] = {
+    1, 2, 7, 1, 3, 4, 5, 3,
+    1, 0, 1, 2, 4, 4, 9, 9,
+    1, 2, 7, 1, 3, 4, 5, 3,
+    1, 0, 1, 2, 4, 4, 9, 9,
+  };
+
+  return __riscv_vle8_v_u8m1(arr, 16);
+}
+
+/* { dg-final { scan-assembler-not {vle[0-9]+\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} } } */
+/* { dg-final { scan-assembler-not {vs[0-9]+r\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-3.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-3.c
new file mode 100644
index 00000000000..5331e547ed3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-3.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -march=rv64gcv -mabi=lp64d -ftree-vectorize --param=riscv-autovec-preference=fixed-vlmax -Wno-psabi" } */
+
+#include "riscv_vector.h"
+
+vuint8m2_t test () {
+  uint8_t arr[32] = {
+    1, 2, 7, 1, 3, 4, 5, 3,
+    1, 0, 1, 2, 4, 4, 9, 9,
+    1, 2, 7, 1, 3, 4, 5, 3,
+    1, 0, 1, 2, 4, 4, 9, 9,
+  };
+
+  return __riscv_vle8_v_u8m2(arr, 8);
+}
+
+/* { dg-final { scan-assembler-not {vle[0-9]+\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} } } */
+/* { dg-final { scan-assembler-not {vs[0-9]+r\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-4.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-4.c
new file mode 100644
index 00000000000..0c728f93514
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-4.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -march=rv64gcv -mabi=lp64d -ftree-vectorize --param=riscv-autovec-preference=fixed-vlmax -Wno-psabi" } */
+
+#include "riscv_vector.h"
+
+vuint8mf2_t test () {
+  uint8_t arr[32] = {
+    1, 2, 7, 1, 3, 4, 5, 3,
+    1, 0, 1, 2, 4, 4, 9, 9,
+    1, 2, 7, 1, 3, 4, 5, 3,
+    1, 0, 1, 2, 4, 4, 9, 9,
+  };
+
+  return __riscv_vle8_v_u8mf2(arr, 32);
+}
+
+/* { dg-final { scan-assembler-not {vle[0-9]+\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} } } */
+/* { dg-final { scan-assembler-not {vs[0-9]+r\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-5.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-5.c
new file mode 100644
index 00000000000..ccfc40cd382
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-5.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -march=rv64gcv -mabi=lp64d -ftree-vectorize --param=riscv-autovec-preference=fixed-vlmax -Wno-psabi" } */
+
+#include "riscv_vector.h"
+
+vuint8m2_t test () {
+  uint8_t arr[32] = {
+    1, 2, 7, 1, 3, 4, 5, 3,
+    1, 0, 1, 2, 4, 4, 9, 9,
+    1, 2, 7, 1, 3, 4, 5, 3,
+    1, 0, 1, 2, 4, 4, 9, 9,
+  };
+
+  return __riscv_vle8_v_u8m2(arr, 4);
+}
+
+/* { dg-final { scan-assembler-not {vle[0-9]+\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} } } */
+/* { dg-final { scan-assembler-not {vs[0-9]+r\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-6.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-6.c
new file mode 100644
index 00000000000..ce7ddbb99b2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-6.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -march=rv64gcv -mabi=lp64d -ftree-vectorize --param=riscv-autovec-preference=fixed-vlmax -Wno-psabi" } */
+
+#include "riscv_vector.h"
+
+vuint8m8_t test () {
+  uint8_t arr[32] = {
+    1, 2, 7, 1, 3, 4, 5, 3,
+    1, 0, 1, 2, 4, 4, 9, 9,
+    1, 2, 7, 1, 3, 4, 5, 3,
+    1, 0, 1, 2, 4, 4, 9, 9,
+  };
+
+  return __riscv_vle8_v_u8m8(arr, 32);
+}
+
+/* { dg-final { scan-assembler-times {vle[0-9]+\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} 1 } } */
+/* { dg-final { scan-assembler-times {vs[0-9]+r\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} 1 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-7.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-7.c
new file mode 100644
index 00000000000..ac0100a1211
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-7.c
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -march=rv64gcv -mabi=lp64d -ftree-vectorize --param=riscv-autovec-preference=fixed-vlmax -Wno-psabi" } */
+
+#include "riscv_vector.h"
+
+vbool8_t test () {
+  uint8_t arr[32] = {
+    1, 2, 7, 1, 3, 4, 5, 3,
+    1, 0, 1, 2, 4, 4, 9, 9,
+    1, 2, 7, 1, 3, 4, 5, 3,
+    1, 0, 1, 2, 4, 4, 9, 9,
+  };
+
+  vuint8m1_t varr = __riscv_vle8_v_u8m1(arr, 32);
+  vuint8m1_t vand_m = __riscv_vand_vx_u8m1(varr, 1, 32);
+
+  return __riscv_vreinterpret_v_u8m1_b8(vand_m);
+}
+
+/* { dg-final { scan-assembler-not {vle[0-9]+\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} } } */
+/* { dg-final { scan-assembler-not {vs[0-9]+r\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-8.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-8.c
new file mode 100644
index 00000000000..b7ebef80954
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-8.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -march=rv64gcv -mabi=lp64d -ftree-vectorize --param=riscv-autovec-preference=fixed-vlmax -Wno-psabi" } */
+
+#include "riscv_vector.h"
+
+vfloat32m1_t test () {
+  float arr[32] = {
+    1.0, 2.2, 7.8, 1.2, 3.3, 4.7, 5.5, 3.3,
+    1.0, 0.2, 1.8, 2.2, 4.3, 4.7, 9.5, 9.3,
+    1.0, 2.2, 7.8, 1.2, 3.3, 4.7, 5.5, 3.3,
+    1.0, 0.2, 1.8, 2.2, 4.3, 4.7, 9.5, 9.3,
+  };
+
+  return __riscv_vle32_v_f32m1(arr, 32);
+}
+
+/* { dg-final { scan-assembler-not {vle[0-9]+\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} } } */
+/* { dg-final { scan-assembler-not {vs[0-9]+r\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-9.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-9.c
new file mode 100644
index 00000000000..21fed06d201
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-9.c
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -march=rv64gcv -mabi=lp64d -ftree-vectorize --param=riscv-autovec-preference=fixed-vlmax -Wno-psabi" } */
+
+#include "riscv_vector.h"
+
+vfloat64m8_t test () {
+  double arr[8] = {
+    1.0, 2.2, 7.8, 1.2, 3.3, 4.7, 5.5, 3.3,
+  };
+
+  return __riscv_vle64_v_f64m8(arr, 4);
+}
+
+/* { dg-final { scan-assembler-times {vle[0-9]+\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} 1 } } */
+/* { dg-final { scan-assembler-times {vs[0-9]+r\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} 1 } } */
-- 
2.34.1



* Re: [PATCH v1] EXPMED: Allow vector mode for DSE extract_low_bits [PR111720]
  2023-11-02  3:14 [PATCH v1] EXPMED: Allow vector mode for DSE extract_low_bits [PR111720] pan2.li
@ 2023-11-02  8:19 ` Richard Biener
  2023-11-02 12:17   ` Li, Pan2
  2023-11-09  6:08 ` [PATCH v2] DSE: Allow vector type for get_stored_val when read < store pan2.li
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 18+ messages in thread
From: Richard Biener @ 2023-11-02  8:19 UTC
  To: pan2.li
  Cc: gcc-patches, juzhe.zhong, yanzhang.wang, kito.cheng, jeffreyalaw,
	richard.sandiford

On Thu, Nov 2, 2023 at 4:15 AM <pan2.li@intel.com> wrote:
>
> From: Pan Li <pan2.li@intel.com>
>
> The extract_low_bits only try the scalar mode if the bitsize of
> the mode and src_mode is not equal. When vector mode is given
> from get_stored_val in DSE, it will always fail and return NULL_RTX.
>
> This patch would like to allow the vector mode in the extract_low_bits
> if and only if the size of mode is less than or equals to the size of
> the src_mode.
>
> Given below example code with --param=riscv-autovec-preference=fixed-vlmax.
>
> vuint8m1_t test () {
>   uint8_t arr[32] = {
>     1, 2, 7, 1, 3, 4, 5, 3, 1, 0, 1, 2, 4, 4, 9, 9,
>     1, 2, 7, 1, 3, 4, 5, 3, 1, 0, 1, 2, 4, 4, 9, 9,
>   };
>
>   return __riscv_vle8_v_u8m1(arr, 32);
> }
>
> Before this patch:
>
> test:
>   lui     a5,%hi(.LANCHOR0)
>   addi    sp,sp,-32
>   addi    a5,a5,%lo(.LANCHOR0)
>   li      a3,32
>   vl2re64.v       v2,0(a5)
>   vsetvli zero,a3,e8,m1,ta,ma
>   vs2r.v  v2,0(sp)             <== Unnecessary store to stack
>   vle8.v  v1,0(sp)             <== Ditto
>   vs1r.v  v1,0(a0)
>   addi    sp,sp,32
>   jr      ra
>
> After this patch:
>
> test:
>   lui     a5,%hi(.LANCHOR0)
>   addi    a5,a5,%lo(.LANCHOR0)
>   li      a4,32
>   addi    sp,sp,-32
>   vsetvli zero,a4,e8,m1,ta,ma
>   vle8.v  v1,0(a5)
>   vs1r.v  v1,0(a0)
>   addi    sp,sp,32
>   jr      ra
>
> Below tests are passed within this patch:
>
> * The x86 bootstrap and regression test.
> * The aarch64 regression test.
> * The risc-v regression test.
>
>         PR target/111720
>
> gcc/ChangeLog:
>
>         * expmed.cc (extract_low_bits): Allow vector mode if the
>         mode size is less than or equal to src_mode.
>
> gcc/testsuite/ChangeLog:
>
>         * gcc.target/riscv/rvv/base/pr111720-0.c: New test.
>         * gcc.target/riscv/rvv/base/pr111720-1.c: New test.
>         * gcc.target/riscv/rvv/base/pr111720-10.c: New test.
>         * gcc.target/riscv/rvv/base/pr111720-2.c: New test.
>         * gcc.target/riscv/rvv/base/pr111720-3.c: New test.
>         * gcc.target/riscv/rvv/base/pr111720-4.c: New test.
>         * gcc.target/riscv/rvv/base/pr111720-5.c: New test.
>         * gcc.target/riscv/rvv/base/pr111720-6.c: New test.
>         * gcc.target/riscv/rvv/base/pr111720-7.c: New test.
>         * gcc.target/riscv/rvv/base/pr111720-8.c: New test.
>         * gcc.target/riscv/rvv/base/pr111720-9.c: New test.
>
> Signed-off-by: Pan Li <pan2.li@intel.com>
> ---
>  gcc/expmed.cc                                 | 44 ++++++++++++-------
>  .../gcc.target/riscv/rvv/base/pr111720-0.c    | 18 ++++++++
>  .../gcc.target/riscv/rvv/base/pr111720-1.c    | 18 ++++++++
>  .../gcc.target/riscv/rvv/base/pr111720-10.c   | 18 ++++++++
>  .../gcc.target/riscv/rvv/base/pr111720-2.c    | 18 ++++++++
>  .../gcc.target/riscv/rvv/base/pr111720-3.c    | 18 ++++++++
>  .../gcc.target/riscv/rvv/base/pr111720-4.c    | 18 ++++++++
>  .../gcc.target/riscv/rvv/base/pr111720-5.c    | 18 ++++++++
>  .../gcc.target/riscv/rvv/base/pr111720-6.c    | 18 ++++++++
>  .../gcc.target/riscv/rvv/base/pr111720-7.c    | 21 +++++++++
>  .../gcc.target/riscv/rvv/base/pr111720-8.c    | 18 ++++++++
>  .../gcc.target/riscv/rvv/base/pr111720-9.c    | 15 +++++++
>  12 files changed, 227 insertions(+), 15 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-0.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-1.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-10.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-2.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-3.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-4.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-5.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-6.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-7.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-8.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-9.c
>
> diff --git a/gcc/expmed.cc b/gcc/expmed.cc
> index b294eabb08d..5db83fe638c 100644
> --- a/gcc/expmed.cc
> +++ b/gcc/expmed.cc
> @@ -2403,8 +2403,6 @@ extract_split_bit_field (rtx op0, opt_scalar_int_mode op0_mode,
>  rtx
>  extract_low_bits (machine_mode mode, machine_mode src_mode, rtx src)
>  {
> -  scalar_int_mode int_mode, src_int_mode;
> -
>    if (mode == src_mode)
>      return src;
>
> @@ -2437,22 +2435,38 @@ extract_low_bits (machine_mode mode, machine_mode src_mode, rtx src)
>          return x;
>      }
>
> -  if (!int_mode_for_mode (src_mode).exists (&src_int_mode)
> -      || !int_mode_for_mode (mode).exists (&int_mode))
> -    return NULL_RTX;
> +  if (VECTOR_MODE_P (mode) && VECTOR_MODE_P (src_mode))

When there are integer modes for the vector modes, you now take a
different path.  A way that changes less of the existing behaviour
would be to write it as

  if (int_mode_for_mode (src_mode).exists (&src_int_mode)
      && int_mode_for_mode (mode).exists (&int_mode))
    {
      ... old code ...
    }
  else if (VECTOR_MODE_P (mode) && VECTOR_MODE_P (src_mode))
    {
      ... new code ...
    }
  else
    return NULL_RTX;
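
To make the intended ordering concrete, here is a small standalone
model of that dispatch.  It is only a sketch: struct mode_info, enum
path and choose_path are hypothetical names made up for illustration,
not GCC types or functions, and the model ignores the
modes_tieable_p and validate_subreg checks that can still make the
real code return NULL_RTX.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for illustration; none of these are GCC types.  */
enum path { PATH_INT, PATH_VECTOR, PATH_FAIL };

struct mode_info
{
  bool is_vector;    /* Models VECTOR_MODE_P (mode).  */
  bool has_int_mode; /* Models int_mode_for_mode (mode).exists ().  */
  unsigned bits;     /* Models a constant GET_MODE_BITSIZE (mode).  */
};

/* Return which path would handle extracting MODE from SRC_MODE.  */
static enum path
choose_path (struct mode_info mode, struct mode_info src_mode)
{
  assert (mode.bits > 0 && src_mode.bits > 0);

  /* Existing behaviour stays first: both modes have an integer mode.  */
  if (src_mode.has_int_mode && mode.has_int_mode)
    return PATH_INT;

  /* The vector path is only reached when the integer path is
     unavailable; it still rejects a result wider than the source.  */
  if (mode.is_vector && src_mode.is_vector)
    return mode.bits <= src_mode.bits ? PATH_VECTOR : PATH_FAIL;

  return PATH_FAIL;
}
```

In this model a pair of vector modes that both have an integer mode
keeps taking the integer path, which is the "old code" case above;
only pairs without integer modes reach the new vector branch.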

> +    {
> +      if (maybe_gt (GET_MODE_BITSIZE (mode), GET_MODE_BITSIZE (src_mode))
> +       || !targetm.modes_tieable_p (mode, src_mode))
> +       return NULL_RTX;
>
> -  if (!targetm.modes_tieable_p (src_int_mode, src_mode))
> -    return NULL_RTX;
> -  if (!targetm.modes_tieable_p (int_mode, mode))
> -    return NULL_RTX;
> +      /* For vector mode,  only the bitsize (mode) <= bitsize (src_mode) and
> +        tieable is allowed here.  */
> +      src = gen_lowpart (mode, src);

so you're really expecting to generate a subreg here?  Given "vector
register layout" isn't something that's very well defined I fear it's
going to be difficult to guarantee the desired semantics of this
function.  IIRC powerpc64le has big-endian lane order for example.
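
As a toy illustration of the concern (a sketch under stated
assumptions, not a description of any real target): if "low part"
means the least significant bits of the register, then which memory
elements it covers depends on the lane order.  The enum lane_order
and the helper low_lanes below are hypothetical names used only for
this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical lane orders: with LANES_LITTLE the first memory element
   sits in the least significant lane of the register, with LANES_BIG
   it sits in the most significant lane.  */
enum lane_order { LANES_LITTLE, LANES_BIG };

/* SRC holds N_SRC one-byte lanes in memory order.  Copy to OUT the
   N_OUT lanes that occupy the least significant part of the register
   under the given lane order.  */
static void
low_lanes (const uint8_t *src, size_t n_src, size_t n_out,
           enum lane_order order, uint8_t *out)
{
  assert (n_out <= n_src);

  /* Little lane order: the low part is the start of the array.
     Big lane order: the low part is the end of the array.  */
  size_t first = (order == LANES_LITTLE) ? 0 : n_src - n_out;
  memcpy (out, src + first, n_out);
}
```

Under the little lane order the low part matches what a narrower load
from the same address would read; under the big lane order it does
not, so a plain lowpart would not give the value DSE wants.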

> +    }
> +  else
> +    {
> +      scalar_int_mode int_mode, src_int_mode;
>
> -  src = gen_lowpart (src_int_mode, src);
> -  if (!validate_subreg (int_mode, src_int_mode, src,
> -                       subreg_lowpart_offset (int_mode, src_int_mode)))
> -    return NULL_RTX;
> +      if (!int_mode_for_mode (src_mode).exists (&src_int_mode)
> +         || !int_mode_for_mode (mode).exists (&int_mode))
> +       return NULL_RTX;
> +
> +      if (!targetm.modes_tieable_p (src_int_mode, src_mode))
> +       return NULL_RTX;
> +      if (!targetm.modes_tieable_p (int_mode, mode))
> +       return NULL_RTX;
> +
> +      src = gen_lowpart (src_int_mode, src);
> +      if (!validate_subreg (int_mode, src_int_mode, src,
> +                           subreg_lowpart_offset (int_mode, src_int_mode)))
> +       return NULL_RTX;
> +
> +      src = convert_modes (int_mode, src_int_mode, src, true);
> +      src = gen_lowpart (mode, src);
> +    }
>
> -  src = convert_modes (int_mode, src_int_mode, src, true);
> -  src = gen_lowpart (mode, src);
>    return src;
>  }


* RE: [PATCH v1] EXPMED: Allow vector mode for DSE extract_low_bits [PR111720]
  2023-11-02  8:19 ` Richard Biener
@ 2023-11-02 12:17   ` Li, Pan2
  0 siblings, 0 replies; 18+ messages in thread
From: Li, Pan2 @ 2023-11-02 12:17 UTC (permalink / raw)
  To: Richard Biener
  Cc: gcc-patches, juzhe.zhong, Wang, Yanzhang, kito.cheng,
	jeffreyalaw, richard.sandiford

Thanks Richard B for the comments.

> when there are integer modes for the vector modes you now go a different path,
> a little less "regressing" would be to write it as
> 
>   if (int_mode_for_mode (src_mode).exists (&src_int_mode)
>        && int_mode_for_mode (mode).exists (&int_mode))
>      {
>         ... old code ...
>      }
>   else if (VECTOR_MODE_P (mode) && VECTOR_MODE_P (src_mode))
>      {
>         ... new code ...
>    }
>   else
>      return NULL_RTX;

That makes sense to me, I will update it in v2.
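
As a rough illustration of the ordering suggested above, here is a minimal
stand-alone sketch. It is not GCC code: sketch_mode, sketch_path and
sketch_select_path are hypothetical names, and the two flags only stand in
for int_mode_for_mode (...).exists () and VECTOR_MODE_P.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a machine mode: only the two properties the
   branch order depends on are modelled.  */
struct sketch_mode
{
  bool is_vector;     /* models VECTOR_MODE_P (mode) */
  bool has_int_mode;  /* models int_mode_for_mode (mode).exists () */
};

enum sketch_path { SKETCH_INT_PATH, SKETCH_VECTOR_PATH, SKETCH_FAIL };

/* Try the existing integer-mode path first, fall back to the vector path
   only when no integer mode exists, otherwise give up.  */
static enum sketch_path
sketch_select_path (struct sketch_mode mode, struct sketch_mode src_mode)
{
  if (src_mode.has_int_mode && mode.has_int_mode)
    return SKETCH_INT_PATH;     /* ... old code ... */
  else if (mode.is_vector && src_mode.is_vector)
    return SKETCH_VECTOR_PATH;  /* ... new code ... */
  else
    return SKETCH_FAIL;         /* return NULL_RTX */
}
```

With this order, a pair of vector modes that both have integer modes still
takes the old path, so only the previously failing cases change behavior.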

> so you're really expecting to generate a subreg here?  Given "vector
> register layout"
> isn't something that's very well defined I fear it's going to be
> difficult to guarantee
> the desired semantics of this function.  IIRC powerpc64le has big-endian lane
> order for example.

This could indeed be a problem, I need to consider the different backends more carefully here.
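
To make the concern concrete, below is a toy model only. It does not
describe any real target, and toy_lowpart and lane0_is_lsb are hypothetical
names. It shows that "the low half of a vector register" selects different
memory elements depending on which end of the register lane 0 occupies.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Copy into OUT the N/2 elements that occupy the least significant half of
   a toy register loaded from MEM.  If LANE0_IS_LSB is nonzero, element 0 of
   MEM sits at the least significant end of the register; otherwise it sits
   at the most significant end.  */
static void
toy_lowpart (const uint8_t *mem, size_t n, int lane0_is_lsb, uint8_t *out)
{
  size_t half = n / 2;
  size_t start = lane0_is_lsb ? 0 : n - half;

  assert (mem != NULL && out != NULL);
  for (size_t i = 0; i < half; i++)
    out[i] = mem[start + i];
}
```

In this model the low part matches the first elements in memory only for
one of the two lane orders, which is why a plain lowpart is not obviously
equivalent to a narrower load from the same address on every backend.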

Pan


-----Original Message-----
From: Richard Biener <richard.guenther@gmail.com> 
Sent: Thursday, November 2, 2023 4:20 PM
To: Li, Pan2 <pan2.li@intel.com>
Cc: gcc-patches@gcc.gnu.org; juzhe.zhong@rivai.ai; Wang, Yanzhang <yanzhang.wang@intel.com>; kito.cheng@gmail.com; jeffreyalaw@gmail.com; richard.sandiford@arm.com
Subject: Re: [PATCH v1] EXPMED: Allow vector mode for DSE extract_low_bits [PR111720]

On Thu, Nov 2, 2023 at 4:15 AM <pan2.li@intel.com> wrote:
>
> From: Pan Li <pan2.li@intel.com>
>
> The extract_low_bits only try the scalar mode if the bitsize of
> the mode and src_mode is not equal. When vector mode is given
> from get_stored_val in DSE, it will always fail and return NULL_RTX.
>
> This patch would like to allow the vector mode in the extract_low_bits
> if and only if the size of mode is less than or equals to the size of
> the src_mode.
>
> Given below example code with --param=riscv-autovec-preference=fixed-vlmax.
>
> vuint8m1_t test () {
>   uint8_t arr[32] = {
>     1, 2, 7, 1, 3, 4, 5, 3, 1, 0, 1, 2, 4, 4, 9, 9,
>     1, 2, 7, 1, 3, 4, 5, 3, 1, 0, 1, 2, 4, 4, 9, 9,
>   };
>
>   return __riscv_vle8_v_u8m1(arr, 32);
> }
>
> Before this patch:
>
> test:
>   lui     a5,%hi(.LANCHOR0)
>   addi    sp,sp,-32
>   addi    a5,a5,%lo(.LANCHOR0)
>   li      a3,32
>   vl2re64.v       v2,0(a5)
>   vsetvli zero,a3,e8,m1,ta,ma
>   vs2r.v  v2,0(sp)             <== Unnecessary store to stack
>   vle8.v  v1,0(sp)             <== Ditto
>   vs1r.v  v1,0(a0)
>   addi    sp,sp,32
>   jr      ra
>
> After this patch:
>
> test:
>   lui     a5,%hi(.LANCHOR0)
>   addi    a5,a5,%lo(.LANCHOR0)
>   li      a4,32
>   addi    sp,sp,-32
>   vsetvli zero,a4,e8,m1,ta,ma
>   vle8.v  v1,0(a5)
>   vs1r.v  v1,0(a0)
>   addi    sp,sp,32
>   jr      ra
>
> Below tests are passed within this patch:
>
> * The x86 bootstrap and regression test.
> * The aarch64 regression test.
> * The risc-v regression test.
>
>         PR target/111720
>
> gcc/ChangeLog:
>
>         * expmed.cc (extract_low_bits): Allow vector mode if the
>         mode size is less than or equal to src_mode.
>
> gcc/testsuite/ChangeLog:
>
>         * gcc.target/riscv/rvv/base/pr111720-0.c: New test.
>         * gcc.target/riscv/rvv/base/pr111720-1.c: New test.
>         * gcc.target/riscv/rvv/base/pr111720-10.c: New test.
>         * gcc.target/riscv/rvv/base/pr111720-2.c: New test.
>         * gcc.target/riscv/rvv/base/pr111720-3.c: New test.
>         * gcc.target/riscv/rvv/base/pr111720-4.c: New test.
>         * gcc.target/riscv/rvv/base/pr111720-5.c: New test.
>         * gcc.target/riscv/rvv/base/pr111720-6.c: New test.
>         * gcc.target/riscv/rvv/base/pr111720-7.c: New test.
>         * gcc.target/riscv/rvv/base/pr111720-8.c: New test.
>         * gcc.target/riscv/rvv/base/pr111720-9.c: New test.
>
> Signed-off-by: Pan Li <pan2.li@intel.com>
> ---
>  gcc/expmed.cc                                 | 44 ++++++++++++-------
>  .../gcc.target/riscv/rvv/base/pr111720-0.c    | 18 ++++++++
>  .../gcc.target/riscv/rvv/base/pr111720-1.c    | 18 ++++++++
>  .../gcc.target/riscv/rvv/base/pr111720-10.c   | 18 ++++++++
>  .../gcc.target/riscv/rvv/base/pr111720-2.c    | 18 ++++++++
>  .../gcc.target/riscv/rvv/base/pr111720-3.c    | 18 ++++++++
>  .../gcc.target/riscv/rvv/base/pr111720-4.c    | 18 ++++++++
>  .../gcc.target/riscv/rvv/base/pr111720-5.c    | 18 ++++++++
>  .../gcc.target/riscv/rvv/base/pr111720-6.c    | 18 ++++++++
>  .../gcc.target/riscv/rvv/base/pr111720-7.c    | 21 +++++++++
>  .../gcc.target/riscv/rvv/base/pr111720-8.c    | 18 ++++++++
>  .../gcc.target/riscv/rvv/base/pr111720-9.c    | 15 +++++++
>  12 files changed, 227 insertions(+), 15 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-0.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-1.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-10.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-2.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-3.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-4.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-5.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-6.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-7.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-8.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-9.c
>
> diff --git a/gcc/expmed.cc b/gcc/expmed.cc
> index b294eabb08d..5db83fe638c 100644
> --- a/gcc/expmed.cc
> +++ b/gcc/expmed.cc
> @@ -2403,8 +2403,6 @@ extract_split_bit_field (rtx op0, opt_scalar_int_mode op0_mode,
>  rtx
>  extract_low_bits (machine_mode mode, machine_mode src_mode, rtx src)
>  {
> -  scalar_int_mode int_mode, src_int_mode;
> -
>    if (mode == src_mode)
>      return src;
>
> @@ -2437,22 +2435,38 @@ extract_low_bits (machine_mode mode, machine_mode src_mode, rtx src)
>          return x;
>      }
>
> -  if (!int_mode_for_mode (src_mode).exists (&src_int_mode)
> -      || !int_mode_for_mode (mode).exists (&int_mode))
> -    return NULL_RTX;
> +  if (VECTOR_MODE_P (mode) && VECTOR_MODE_P (src_mode))

when there are integer modes for the vector modes you now go a different path,
a little less "regressing" would be to write it as

   if (int_mode_for_mode (src_mode).exists (&src_int_mode)
       && int_mode_for_mode (mode).exists (&int_mode))
     {
        ... old code ...
     }
  else if (VECTOR_MODE_P (mode) && VECTOR_MODE_P (src_mode))
     {
        ... new code ...
     }
  else
     return NULL_RTX;

> +    {
> +      if (maybe_gt (GET_MODE_BITSIZE (mode), GET_MODE_BITSIZE (src_mode))
> +       || !targetm.modes_tieable_p (mode, src_mode))
> +       return NULL_RTX;
>
> -  if (!targetm.modes_tieable_p (src_int_mode, src_mode))
> -    return NULL_RTX;
> -  if (!targetm.modes_tieable_p (int_mode, mode))
> -    return NULL_RTX;
> +      /* For vector mode,  only the bitsize (mode) <= bitsize (src_mode) and
> +        tieable is allowed here.  */
> +      src = gen_lowpart (mode, src);

so you're really expecting to generate a subreg here?  Given "vector
register layout"
isn't something that's very well defined I fear it's going to be
difficult to guarantee
the desired semantics of this function.  IIRC powerpc64le has big-endian lane
order for example.

> +    }
> +  else
> +    {
> +      scalar_int_mode int_mode, src_int_mode;
>
> -  src = gen_lowpart (src_int_mode, src);
> -  if (!validate_subreg (int_mode, src_int_mode, src,
> -                       subreg_lowpart_offset (int_mode, src_int_mode)))
> -    return NULL_RTX;
> +      if (!int_mode_for_mode (src_mode).exists (&src_int_mode)
> +         || !int_mode_for_mode (mode).exists (&int_mode))
> +       return NULL_RTX;
> +
> +      if (!targetm.modes_tieable_p (src_int_mode, src_mode))
> +       return NULL_RTX;
> +      if (!targetm.modes_tieable_p (int_mode, mode))
> +       return NULL_RTX;
> +
> +      src = gen_lowpart (src_int_mode, src);
> +      if (!validate_subreg (int_mode, src_int_mode, src,
> +                           subreg_lowpart_offset (int_mode, src_int_mode)))
> +       return NULL_RTX;
> +
> +      src = convert_modes (int_mode, src_int_mode, src, true);
> +      src = gen_lowpart (mode, src);
> +    }
>
> -  src = convert_modes (int_mode, src_int_mode, src, true);
> -  src = gen_lowpart (mode, src);
>    return src;
>  }
>
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-0.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-0.c
> new file mode 100644
> index 00000000000..a61e94a6d98
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-0.c
> @@ -0,0 +1,18 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O3 -march=rv64gcv -mabi=lp64d -ftree-vectorize --param=riscv-autovec-preference=fixed-vlmax -Wno-psabi" } */
> +
> +#include "riscv_vector.h"
> +
> +vuint8m1_t test () {
> +  uint8_t arr[32] = {
> +    1, 2, 7, 1, 3, 4, 5, 3,
> +    1, 0, 1, 2, 4, 4, 9, 9,
> +    1, 2, 7, 1, 3, 4, 5, 3,
> +    1, 0, 1, 2, 4, 4, 9, 9,
> +  };
> +
> +  return __riscv_vle8_v_u8m1(arr, 32);
> +}
> +
> +/* { dg-final { scan-assembler-not {vle[0-9]+\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} } } */
> +/* { dg-final { scan-assembler-not {vs[0-9]+r\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} } } */
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-1.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-1.c
> new file mode 100644
> index 00000000000..46efd7379ac
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-1.c
> @@ -0,0 +1,18 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O3 -march=rv64gcv -mabi=lp64d -ftree-vectorize --param=riscv-autovec-preference=fixed-vlmax -Wno-psabi" } */
> +
> +#include "riscv_vector.h"
> +
> +vuint8m2_t test () {
> +  uint8_t arr[32] = {
> +    1, 2, 7, 1, 3, 4, 5, 3,
> +    1, 0, 1, 2, 4, 4, 9, 9,
> +    1, 2, 7, 1, 3, 4, 5, 3,
> +    1, 0, 1, 2, 4, 4, 9, 9,
> +  };
> +
> +  return __riscv_vle8_v_u8m2(arr, 32);
> +}
> +
> +/* { dg-final { scan-assembler-not {vle[0-9]+\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} } } */
> +/* { dg-final { scan-assembler-not {vs[0-9]+r\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} } } */
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-10.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-10.c
> new file mode 100644
> index 00000000000..8bebac219a6
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-10.c
> @@ -0,0 +1,18 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O3 -march=rv64gcv -mabi=lp64d -ftree-vectorize --param=riscv-autovec-preference=fixed-vlmax -Wno-psabi" } */
> +
> +#include "riscv_vector.h"
> +
> +vbool4_t test () {
> +  uint8_t arr[32] = {
> +    1, 2, 7, 1, 3, 4, 5, 3,
> +    1, 0, 1, 2, 4, 4, 9, 9,
> +    1, 2, 7, 1, 3, 4, 5, 3,
> +    1, 0, 1, 2, 4, 4, 9, 9,
> +  };
> +
> +  return __riscv_vlm_v_b4(arr, 32);
> +}
> +
> +/* { dg-final { scan-assembler-not {vle[0-9]+\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} } } */
> +/* { dg-final { scan-assembler-not {vs[0-9]+r\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} } } */
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-2.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-2.c
> new file mode 100644
> index 00000000000..47e4243e02e
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-2.c
> @@ -0,0 +1,18 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O3 -march=rv64gcv -mabi=lp64d -ftree-vectorize --param=riscv-autovec-preference=fixed-vlmax -Wno-psabi" } */
> +
> +#include "riscv_vector.h"
> +
> +vuint8m1_t test () {
> +  uint8_t arr[32] = {
> +    1, 2, 7, 1, 3, 4, 5, 3,
> +    1, 0, 1, 2, 4, 4, 9, 9,
> +    1, 2, 7, 1, 3, 4, 5, 3,
> +    1, 0, 1, 2, 4, 4, 9, 9,
> +  };
> +
> +  return __riscv_vle8_v_u8m1(arr, 16);
> +}
> +
> +/* { dg-final { scan-assembler-not {vle[0-9]+\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} } } */
> +/* { dg-final { scan-assembler-not {vs[0-9]+r\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} } } */
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-3.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-3.c
> new file mode 100644
> index 00000000000..5331e547ed3
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-3.c
> @@ -0,0 +1,18 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O3 -march=rv64gcv -mabi=lp64d -ftree-vectorize --param=riscv-autovec-preference=fixed-vlmax -Wno-psabi" } */
> +
> +#include "riscv_vector.h"
> +
> +vuint8m2_t test () {
> +  uint8_t arr[32] = {
> +    1, 2, 7, 1, 3, 4, 5, 3,
> +    1, 0, 1, 2, 4, 4, 9, 9,
> +    1, 2, 7, 1, 3, 4, 5, 3,
> +    1, 0, 1, 2, 4, 4, 9, 9,
> +  };
> +
> +  return __riscv_vle8_v_u8m2(arr, 8);
> +}
> +
> +/* { dg-final { scan-assembler-not {vle[0-9]+\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} } } */
> +/* { dg-final { scan-assembler-not {vs[0-9]+r\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} } } */
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-4.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-4.c
> new file mode 100644
> index 00000000000..0c728f93514
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-4.c
> @@ -0,0 +1,18 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O3 -march=rv64gcv -mabi=lp64d -ftree-vectorize --param=riscv-autovec-preference=fixed-vlmax -Wno-psabi" } */
> +
> +#include "riscv_vector.h"
> +
> +vuint8mf2_t test () {
> +  uint8_t arr[32] = {
> +    1, 2, 7, 1, 3, 4, 5, 3,
> +    1, 0, 1, 2, 4, 4, 9, 9,
> +    1, 2, 7, 1, 3, 4, 5, 3,
> +    1, 0, 1, 2, 4, 4, 9, 9,
> +  };
> +
> +  return __riscv_vle8_v_u8mf2(arr, 32);
> +}
> +
> +/* { dg-final { scan-assembler-not {vle[0-9]+\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} } } */
> +/* { dg-final { scan-assembler-not {vs[0-9]+r\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} } } */
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-5.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-5.c
> new file mode 100644
> index 00000000000..ccfc40cd382
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-5.c
> @@ -0,0 +1,18 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O3 -march=rv64gcv -mabi=lp64d -ftree-vectorize --param=riscv-autovec-preference=fixed-vlmax -Wno-psabi" } */
> +
> +#include "riscv_vector.h"
> +
> +vuint8m2_t test () {
> +  uint8_t arr[32] = {
> +    1, 2, 7, 1, 3, 4, 5, 3,
> +    1, 0, 1, 2, 4, 4, 9, 9,
> +    1, 2, 7, 1, 3, 4, 5, 3,
> +    1, 0, 1, 2, 4, 4, 9, 9,
> +  };
> +
> +  return __riscv_vle8_v_u8m2(arr, 4);
> +}
> +
> +/* { dg-final { scan-assembler-not {vle[0-9]+\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} } } */
> +/* { dg-final { scan-assembler-not {vs[0-9]+r\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} } } */
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-6.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-6.c
> new file mode 100644
> index 00000000000..ce7ddbb99b2
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-6.c
> @@ -0,0 +1,18 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O3 -march=rv64gcv -mabi=lp64d -ftree-vectorize --param=riscv-autovec-preference=fixed-vlmax -Wno-psabi" } */
> +
> +#include "riscv_vector.h"
> +
> +vuint8m8_t test () {
> +  uint8_t arr[32] = {
> +    1, 2, 7, 1, 3, 4, 5, 3,
> +    1, 0, 1, 2, 4, 4, 9, 9,
> +    1, 2, 7, 1, 3, 4, 5, 3,
> +    1, 0, 1, 2, 4, 4, 9, 9,
> +  };
> +
> +  return __riscv_vle8_v_u8m8(arr, 32);
> +}
> +
> +/* { dg-final { scan-assembler-times {vle[0-9]+\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} 1 } } */
> +/* { dg-final { scan-assembler-times {vs[0-9]+r\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} 1 } } */
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-7.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-7.c
> new file mode 100644
> index 00000000000..ac0100a1211
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-7.c
> @@ -0,0 +1,21 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O3 -march=rv64gcv -mabi=lp64d -ftree-vectorize --param=riscv-autovec-preference=fixed-vlmax -Wno-psabi" } */
> +
> +#include "riscv_vector.h"
> +
> +vbool8_t test () {
> +  uint8_t arr[32] = {
> +    1, 2, 7, 1, 3, 4, 5, 3,
> +    1, 0, 1, 2, 4, 4, 9, 9,
> +    1, 2, 7, 1, 3, 4, 5, 3,
> +    1, 0, 1, 2, 4, 4, 9, 9,
> +  };
> +
> +  vuint8m1_t varr = __riscv_vle8_v_u8m1(arr, 32);
> +  vuint8m1_t vand_m = __riscv_vand_vx_u8m1(varr, 1, 32);
> +
> +  return __riscv_vreinterpret_v_u8m1_b8(vand_m);
> +}
> +
> +/* { dg-final { scan-assembler-not {vle[0-9]+\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} } } */
> +/* { dg-final { scan-assembler-not {vs[0-9]+r\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} } } */
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-8.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-8.c
> new file mode 100644
> index 00000000000..b7ebef80954
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-8.c
> @@ -0,0 +1,18 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O3 -march=rv64gcv -mabi=lp64d -ftree-vectorize --param=riscv-autovec-preference=fixed-vlmax -Wno-psabi" } */
> +
> +#include "riscv_vector.h"
> +
> +vfloat32m1_t test () {
> +  float arr[32] = {
> +    1.0, 2.2, 7.8, 1.2, 3.3, 4.7, 5.5, 3.3,
> +    1.0, 0.2, 1.8, 2.2, 4.3, 4.7, 9.5, 9.3,
> +    1.0, 2.2, 7.8, 1.2, 3.3, 4.7, 5.5, 3.3,
> +    1.0, 0.2, 1.8, 2.2, 4.3, 4.7, 9.5, 9.3,
> +  };
> +
> +  return __riscv_vle32_v_f32m1(arr, 32);
> +}
> +
> +/* { dg-final { scan-assembler-not {vle[0-9]+\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} } } */
> +/* { dg-final { scan-assembler-not {vs[0-9]+r\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} } } */
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-9.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-9.c
> new file mode 100644
> index 00000000000..21fed06d201
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-9.c
> @@ -0,0 +1,15 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O3 -march=rv64gcv -mabi=lp64d -ftree-vectorize --param=riscv-autovec-preference=fixed-vlmax -Wno-psabi" } */
> +
> +#include "riscv_vector.h"
> +
> +vfloat64m8_t test () {
> +  double arr[8] = {
> +    1.0, 2.2, 7.8, 1.2, 3.3, 4.7, 5.5, 3.3,
> +  };
> +
> +  return __riscv_vle64_v_f64m8(arr, 4);
> +}
> +
> +/* { dg-final { scan-assembler-times {vle[0-9]+\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} 1 } } */
> +/* { dg-final { scan-assembler-times {vs[0-9]+r\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} 1 } } */
> --
> 2.34.1
>


* [PATCH v2] DSE: Allow vector type for get_stored_val when read < store
  2023-11-02  3:14 [PATCH v1] EXPMED: Allow vector mode for DSE extract_low_bits [PR111720] pan2.li
  2023-11-02  8:19 ` Richard Biener
@ 2023-11-09  6:08 ` pan2.li
  2023-11-09 16:16   ` Jeff Law
  2023-11-12 12:27 ` [PATCH v3] " pan2.li
  2023-11-13  3:22 ` [PATCH v4] " pan2.li
  3 siblings, 1 reply; 18+ messages in thread
From: pan2.li @ 2023-11-09  6:08 UTC (permalink / raw)
  To: gcc-patches
  Cc: juzhe.zhong, pan2.li, yanzhang.wang, kito.cheng, jeffreyalaw,
	richard.guenther, richard.sandiford

From: Pan Li <pan2.li@intel.com>

Update in v2:
* Move vector type support to get_stored_val.

Original log:

This patch allows vector modes in get_stored_val in DSE.
The new path is taken only when both the read mode and
the store mode are vector modes, the read bitsize is less
than the stored bitsize, and the two modes are tieable.
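
The condition described above can be sketched in isolation as follows. This
is a toy model under stated assumptions, not the GCC implementation:
toy_size, toy_mode, toy_known_lt and toy_can_take_lowpart are hypothetical
names, sizes are modelled as c0 + c1 * x bits with x >= 0 unknown until run
time, and the tieable answer is passed in because it is a target decision.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy size of c0 + c1 * x bits, where x >= 0 is only known at run time.  */
struct toy_size
{
  long c0;
  long c1;
};

/* Toy mode: a vector flag plus a size.  */
struct toy_mode
{
  bool is_vector;
  struct toy_size bits;
};

/* True only if A < B holds for every x >= 0.  */
static bool
toy_known_lt (struct toy_size a, struct toy_size b)
{
  return a.c0 < b.c0 && a.c1 <= b.c1;
}

/* Shape of the new check: both modes are vectors, the read is known to be
   strictly narrower than the store, and the target says they are tieable.  */
static bool
toy_can_take_lowpart (struct toy_mode read, struct toy_mode store,
		      bool tieable)
{
  return read.is_vector && store.is_vector
	 && toy_known_lt (read.bits, store.bits)
	 && tieable;
}
```

A read of the same size as the store is rejected by this check, as is any
read whose size is only sometimes smaller than the store.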

Take the example code below, compiled with
--param=riscv-autovec-preference=fixed-vlmax.

vuint8m1_t test () {
  uint8_t arr[32] = {
    1, 2, 7, 1, 3, 4, 5, 3, 1, 0, 1, 2, 4, 4, 9, 9,
    1, 2, 7, 1, 3, 4, 5, 3, 1, 0, 1, 2, 4, 4, 9, 9,
  };

  return __riscv_vle8_v_u8m1(arr, 32);
}

Before this patch:
test:
  lui     a5,%hi(.LANCHOR0)
  addi    sp,sp,-32
  addi    a5,a5,%lo(.LANCHOR0)
  li      a3,32
  vl2re64.v       v2,0(a5)
  vsetvli zero,a3,e8,m1,ta,ma
  vs2r.v  v2,0(sp)             <== Unnecessary store to stack
  vle8.v  v1,0(sp)             <== Ditto
  vs1r.v  v1,0(a0)
  addi    sp,sp,32
  jr      ra

After this patch:
test:
  lui     a5,%hi(.LANCHOR0)
  addi    a5,a5,%lo(.LANCHOR0)
  li      a4,32
  addi    sp,sp,-32
  vsetvli zero,a4,e8,m1,ta,ma
  vle8.v  v1,0(a5)
  vs1r.v  v1,0(a0)
  addi    sp,sp,32
  jr      ra

The following tests pass with this patch:

* The x86 bootstrap and regression test.
* The aarch64 regression test.
* The risc-v regression test.

	PR target/111720

gcc/ChangeLog:

	* dse.cc (get_stored_val): Allow vector mode if the read
	bitsize is less than stored bitsize.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/base/pr111720-0.c: New test.
	* gcc.target/riscv/rvv/base/pr111720-1.c: New test.
	* gcc.target/riscv/rvv/base/pr111720-10.c: New test.
	* gcc.target/riscv/rvv/base/pr111720-2.c: New test.
	* gcc.target/riscv/rvv/base/pr111720-3.c: New test.
	* gcc.target/riscv/rvv/base/pr111720-4.c: New test.
	* gcc.target/riscv/rvv/base/pr111720-5.c: New test.
	* gcc.target/riscv/rvv/base/pr111720-6.c: New test.
	* gcc.target/riscv/rvv/base/pr111720-7.c: New test.
	* gcc.target/riscv/rvv/base/pr111720-8.c: New test.
	* gcc.target/riscv/rvv/base/pr111720-9.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>
---
 gcc/dse.cc                                    |  4 ++++
 .../gcc.target/riscv/rvv/base/pr111720-0.c    | 18 ++++++++++++++++
 .../gcc.target/riscv/rvv/base/pr111720-1.c    | 18 ++++++++++++++++
 .../gcc.target/riscv/rvv/base/pr111720-10.c   | 18 ++++++++++++++++
 .../gcc.target/riscv/rvv/base/pr111720-2.c    | 18 ++++++++++++++++
 .../gcc.target/riscv/rvv/base/pr111720-3.c    | 18 ++++++++++++++++
 .../gcc.target/riscv/rvv/base/pr111720-4.c    | 18 ++++++++++++++++
 .../gcc.target/riscv/rvv/base/pr111720-5.c    | 18 ++++++++++++++++
 .../gcc.target/riscv/rvv/base/pr111720-6.c    | 18 ++++++++++++++++
 .../gcc.target/riscv/rvv/base/pr111720-7.c    | 21 +++++++++++++++++++
 .../gcc.target/riscv/rvv/base/pr111720-8.c    | 18 ++++++++++++++++
 .../gcc.target/riscv/rvv/base/pr111720-9.c    | 15 +++++++++++++
 12 files changed, 202 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-0.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-10.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-3.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-4.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-5.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-6.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-7.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-8.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-9.c

diff --git a/gcc/dse.cc b/gcc/dse.cc
index 1a85dae1f8c..21004becd4a 100644
--- a/gcc/dse.cc
+++ b/gcc/dse.cc
@@ -1940,6 +1940,10 @@ get_stored_val (store_info *store_info, machine_mode read_mode,
 	       || GET_MODE_CLASS (read_mode) != GET_MODE_CLASS (store_mode)))
     read_reg = extract_low_bits (read_mode, store_mode,
 				 copy_rtx (store_info->const_rhs));
+  else if (VECTOR_MODE_P (read_mode) && VECTOR_MODE_P (store_mode)
+    && known_lt (GET_MODE_BITSIZE (read_mode), GET_MODE_BITSIZE (store_mode))
+    && targetm.modes_tieable_p (read_mode, store_mode))
+    read_reg = gen_lowpart (read_mode, copy_rtx (store_info->rhs));
   else
     read_reg = extract_low_bits (read_mode, store_mode,
 				 copy_rtx (store_info->rhs));
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-0.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-0.c
new file mode 100644
index 00000000000..a61e94a6d98
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-0.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -march=rv64gcv -mabi=lp64d -ftree-vectorize --param=riscv-autovec-preference=fixed-vlmax -Wno-psabi" } */
+
+#include "riscv_vector.h"
+
+vuint8m1_t test () {
+  uint8_t arr[32] = {
+    1, 2, 7, 1, 3, 4, 5, 3,
+    1, 0, 1, 2, 4, 4, 9, 9,
+    1, 2, 7, 1, 3, 4, 5, 3,
+    1, 0, 1, 2, 4, 4, 9, 9,
+  };
+
+  return __riscv_vle8_v_u8m1(arr, 32);
+}
+
+/* { dg-final { scan-assembler-not {vle[0-9]+\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} } } */
+/* { dg-final { scan-assembler-not {vs[0-9]+r\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-1.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-1.c
new file mode 100644
index 00000000000..46efd7379ac
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-1.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -march=rv64gcv -mabi=lp64d -ftree-vectorize --param=riscv-autovec-preference=fixed-vlmax -Wno-psabi" } */
+
+#include "riscv_vector.h"
+
+vuint8m2_t test () {
+  uint8_t arr[32] = {
+    1, 2, 7, 1, 3, 4, 5, 3,
+    1, 0, 1, 2, 4, 4, 9, 9,
+    1, 2, 7, 1, 3, 4, 5, 3,
+    1, 0, 1, 2, 4, 4, 9, 9,
+  };
+
+  return __riscv_vle8_v_u8m2(arr, 32);
+}
+
+/* { dg-final { scan-assembler-not {vle[0-9]+\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} } } */
+/* { dg-final { scan-assembler-not {vs[0-9]+r\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-10.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-10.c
new file mode 100644
index 00000000000..8bebac219a6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-10.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -march=rv64gcv -mabi=lp64d -ftree-vectorize --param=riscv-autovec-preference=fixed-vlmax -Wno-psabi" } */
+
+#include "riscv_vector.h"
+
+vbool4_t test () {
+  uint8_t arr[32] = {
+    1, 2, 7, 1, 3, 4, 5, 3,
+    1, 0, 1, 2, 4, 4, 9, 9,
+    1, 2, 7, 1, 3, 4, 5, 3,
+    1, 0, 1, 2, 4, 4, 9, 9,
+  };
+
+  return __riscv_vlm_v_b4(arr, 32);
+}
+
+/* { dg-final { scan-assembler-not {vle[0-9]+\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} } } */
+/* { dg-final { scan-assembler-not {vs[0-9]+r\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-2.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-2.c
new file mode 100644
index 00000000000..47e4243e02e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-2.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -march=rv64gcv -mabi=lp64d -ftree-vectorize --param=riscv-autovec-preference=fixed-vlmax -Wno-psabi" } */
+
+#include "riscv_vector.h"
+
+vuint8m1_t test () {
+  uint8_t arr[32] = {
+    1, 2, 7, 1, 3, 4, 5, 3,
+    1, 0, 1, 2, 4, 4, 9, 9,
+    1, 2, 7, 1, 3, 4, 5, 3,
+    1, 0, 1, 2, 4, 4, 9, 9,
+  };
+
+  return __riscv_vle8_v_u8m1(arr, 16);
+}
+
+/* { dg-final { scan-assembler-not {vle[0-9]+\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} } } */
+/* { dg-final { scan-assembler-not {vs[0-9]+r\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-3.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-3.c
new file mode 100644
index 00000000000..5331e547ed3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-3.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -march=rv64gcv -mabi=lp64d -ftree-vectorize --param=riscv-autovec-preference=fixed-vlmax -Wno-psabi" } */
+
+#include "riscv_vector.h"
+
+vuint8m2_t test () {
+  uint8_t arr[32] = {
+    1, 2, 7, 1, 3, 4, 5, 3,
+    1, 0, 1, 2, 4, 4, 9, 9,
+    1, 2, 7, 1, 3, 4, 5, 3,
+    1, 0, 1, 2, 4, 4, 9, 9,
+  };
+
+  return __riscv_vle8_v_u8m2(arr, 8);
+}
+
+/* { dg-final { scan-assembler-not {vle[0-9]+\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} } } */
+/* { dg-final { scan-assembler-not {vs[0-9]+r\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-4.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-4.c
new file mode 100644
index 00000000000..0c728f93514
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-4.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -march=rv64gcv -mabi=lp64d -ftree-vectorize --param=riscv-autovec-preference=fixed-vlmax -Wno-psabi" } */
+
+#include "riscv_vector.h"
+
+vuint8mf2_t test () {
+  uint8_t arr[32] = {
+    1, 2, 7, 1, 3, 4, 5, 3,
+    1, 0, 1, 2, 4, 4, 9, 9,
+    1, 2, 7, 1, 3, 4, 5, 3,
+    1, 0, 1, 2, 4, 4, 9, 9,
+  };
+
+  return __riscv_vle8_v_u8mf2(arr, 32);
+}
+
+/* { dg-final { scan-assembler-not {vle[0-9]+\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} } } */
+/* { dg-final { scan-assembler-not {vs[0-9]+r\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-5.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-5.c
new file mode 100644
index 00000000000..ccfc40cd382
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-5.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -march=rv64gcv -mabi=lp64d -ftree-vectorize --param=riscv-autovec-preference=fixed-vlmax -Wno-psabi" } */
+
+#include "riscv_vector.h"
+
+vuint8m2_t test () {
+  uint8_t arr[32] = {
+    1, 2, 7, 1, 3, 4, 5, 3,
+    1, 0, 1, 2, 4, 4, 9, 9,
+    1, 2, 7, 1, 3, 4, 5, 3,
+    1, 0, 1, 2, 4, 4, 9, 9,
+  };
+
+  return __riscv_vle8_v_u8m2(arr, 4);
+}
+
+/* { dg-final { scan-assembler-not {vle[0-9]+\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} } } */
+/* { dg-final { scan-assembler-not {vs[0-9]+r\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-6.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-6.c
new file mode 100644
index 00000000000..ce7ddbb99b2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-6.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -march=rv64gcv -mabi=lp64d -ftree-vectorize --param=riscv-autovec-preference=fixed-vlmax -Wno-psabi" } */
+
+#include "riscv_vector.h"
+
+vuint8m8_t test () {
+  uint8_t arr[32] = {
+    1, 2, 7, 1, 3, 4, 5, 3,
+    1, 0, 1, 2, 4, 4, 9, 9,
+    1, 2, 7, 1, 3, 4, 5, 3,
+    1, 0, 1, 2, 4, 4, 9, 9,
+  };
+
+  return __riscv_vle8_v_u8m8(arr, 32);
+}
+
+/* { dg-final { scan-assembler-times {vle[0-9]+\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} 1 } } */
+/* { dg-final { scan-assembler-times {vs[0-9]+r\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} 1 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-7.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-7.c
new file mode 100644
index 00000000000..ac0100a1211
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-7.c
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -march=rv64gcv -mabi=lp64d -ftree-vectorize --param=riscv-autovec-preference=fixed-vlmax -Wno-psabi" } */
+
+#include "riscv_vector.h"
+
+vbool8_t test () {
+  uint8_t arr[32] = {
+    1, 2, 7, 1, 3, 4, 5, 3,
+    1, 0, 1, 2, 4, 4, 9, 9,
+    1, 2, 7, 1, 3, 4, 5, 3,
+    1, 0, 1, 2, 4, 4, 9, 9,
+  };
+
+  vuint8m1_t varr = __riscv_vle8_v_u8m1(arr, 32);
+  vuint8m1_t vand_m = __riscv_vand_vx_u8m1(varr, 1, 32);
+
+  return __riscv_vreinterpret_v_u8m1_b8(vand_m);
+}
+
+/* { dg-final { scan-assembler-not {vle[0-9]+\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} } } */
+/* { dg-final { scan-assembler-not {vs[0-9]+r\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-8.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-8.c
new file mode 100644
index 00000000000..b7ebef80954
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-8.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -march=rv64gcv -mabi=lp64d -ftree-vectorize --param=riscv-autovec-preference=fixed-vlmax -Wno-psabi" } */
+
+#include "riscv_vector.h"
+
+vfloat32m1_t test () {
+  float arr[32] = {
+    1.0, 2.2, 7.8, 1.2, 3.3, 4.7, 5.5, 3.3,
+    1.0, 0.2, 1.8, 2.2, 4.3, 4.7, 9.5, 9.3,
+    1.0, 2.2, 7.8, 1.2, 3.3, 4.7, 5.5, 3.3,
+    1.0, 0.2, 1.8, 2.2, 4.3, 4.7, 9.5, 9.3,
+  };
+
+  return __riscv_vle32_v_f32m1(arr, 32);
+}
+
+/* { dg-final { scan-assembler-not {vle[0-9]+\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} } } */
+/* { dg-final { scan-assembler-not {vs[0-9]+r\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-9.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-9.c
new file mode 100644
index 00000000000..21fed06d201
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-9.c
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -march=rv64gcv -mabi=lp64d -ftree-vectorize --param=riscv-autovec-preference=fixed-vlmax -Wno-psabi" } */
+
+#include "riscv_vector.h"
+
+vfloat64m8_t test () {
+  double arr[8] = {
+    1.0, 2.2, 7.8, 1.2, 3.3, 4.7, 5.5, 3.3,
+  };
+
+  return __riscv_vle64_v_f64m8(arr, 4);
+}
+
+/* { dg-final { scan-assembler-times {vle[0-9]+\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} 1 } } */
+/* { dg-final { scan-assembler-times {vs[0-9]+r\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} 1 } } */
-- 
2.34.1


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2] DSE: Allow vector type for get_stored_val when read < store
  2023-11-09  6:08 ` [PATCH v2] DSE: Allow vector type for get_stored_val when read < store pan2.li
@ 2023-11-09 16:16   ` Jeff Law
  2023-11-11 15:23     ` Richard Sandiford
  0 siblings, 1 reply; 18+ messages in thread
From: Jeff Law @ 2023-11-09 16:16 UTC (permalink / raw)
  To: pan2.li, gcc-patches
  Cc: juzhe.zhong, yanzhang.wang, kito.cheng, richard.guenther,
	richard.sandiford



On 11/8/23 23:08, pan2.li@intel.com wrote:
> From: Pan Li <pan2.li@intel.com>
> 
> Update in v2:
> * Move vector type support to get_stored_val.
> 
> Original log:
> 
> This patch would like to allow the vector mode in the
> get_stored_val in the DSE. It is valid for the read
> rtx if and only if the read bitsize is less than the
> stored bitsize.
> 
> Given below example code with
> --param=riscv-autovec-preference=fixed-vlmax.
> 
> vuint8m1_t test () {
>    uint8_t arr[32] = {
>      1, 2, 7, 1, 3, 4, 5, 3, 1, 0, 1, 2, 4, 4, 9, 9,
>      1, 2, 7, 1, 3, 4, 5, 3, 1, 0, 1, 2, 4, 4, 9, 9,
>    };
> 
>    return __riscv_vle8_v_u8m1(arr, 32);
> }
> 
> Before this patch:
> test:
>    lui     a5,%hi(.LANCHOR0)
>    addi    sp,sp,-32
>    addi    a5,a5,%lo(.LANCHOR0)
>    li      a3,32
>    vl2re64.v       v2,0(a5)
>    vsetvli zero,a3,e8,m1,ta,ma
>    vs2r.v  v2,0(sp)             <== Unnecessary store to stack
>    vle8.v  v1,0(sp)             <== Ditto
>    vs1r.v  v1,0(a0)
>    addi    sp,sp,32
>    jr      ra
> 
> After this patch:
> test:
>    lui     a5,%hi(.LANCHOR0)
>    addi    a5,a5,%lo(.LANCHOR0)
>    li      a4,32
>    addi    sp,sp,-32
>    vsetvli zero,a4,e8,m1,ta,ma
>    vle8.v  v1,0(a5)
>    vs1r.v  v1,0(a0)
>    addi    sp,sp,32
>    jr      ra
> 
> Below tests are passed within this patch:
> 
> * The x86 bootstrap and regression test.
> * The aarch64 regression test.
> * The risc-v regression test.
> 
> 	PR target/111720
> 
> gcc/ChangeLog:
> 
> 	* dse.cc (get_stored_val): Allow vector mode if the read
> 	bitsize is less than stored bitsize.
> 
> gcc/testsuite/ChangeLog:
> 
> 	* gcc.target/riscv/rvv/base/pr111720-0.c: New test.
> 	* gcc.target/riscv/rvv/base/pr111720-1.c: New test.
> 	* gcc.target/riscv/rvv/base/pr111720-10.c: New test.
> 	* gcc.target/riscv/rvv/base/pr111720-2.c: New test.
> 	* gcc.target/riscv/rvv/base/pr111720-3.c: New test.
> 	* gcc.target/riscv/rvv/base/pr111720-4.c: New test.
> 	* gcc.target/riscv/rvv/base/pr111720-5.c: New test.
> 	* gcc.target/riscv/rvv/base/pr111720-6.c: New test.
> 	* gcc.target/riscv/rvv/base/pr111720-7.c: New test.
> 	* gcc.target/riscv/rvv/base/pr111720-8.c: New test.
> 	* gcc.target/riscv/rvv/base/pr111720-9.c: New test.
We're always getting the lowpart here AFAICT, and it appears that the 
right thing should happen if gen_lowpart_common fails (it returns NULL, 
which bubbles up and is the right return value from get_stored_val if it 
can't be optimized).

Did you want to use known_le so that you'd pick up the case when the two 
modes are the same size?  Or was known_lt the test you really wanted 
(and if so, why)?


OK using known_lt, or known_le.  If you decide to change to known_le, 
you'll need to bootstrap & regression test again on x86.
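
The difference between the two predicates can be sketched with a small
stand-alone model.  This is only an illustration, not GCC's poly_int
implementation: the poly2 struct and the two free functions are
hypothetical stand-ins that treat a size as c0 + c1 * x for an unknown
runtime x >= 0.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a two-coefficient poly_int64: the value is
// c0 + c1 * x, where x >= 0 is only known at run time.
struct poly2
{
  int64_t c0;
  int64_t c1;
};

// True only if a <= b holds for every x >= 0.
bool
known_le (poly2 a, poly2 b)
{
  return a.c0 <= b.c0 && a.c1 <= b.c1;
}

// True only if a < b holds for every x >= 0.  The x == 0 case forces
// the constant terms to be strictly ordered.
bool
known_lt (poly2 a, poly2 b)
{
  return a.c0 < b.c0 && a.c1 <= b.c1;
}
```

With a read size and a store size that are both poly2{128, 0}, known_le
holds but known_lt does not, which is the equal-size case the question
above is about.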



jeff

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2] DSE: Allow vector type for get_stored_val when read < store
  2023-11-09 16:16   ` Jeff Law
@ 2023-11-11 15:23     ` Richard Sandiford
  2023-11-12  2:30       ` Li, Pan2
  0 siblings, 1 reply; 18+ messages in thread
From: Richard Sandiford @ 2023-11-11 15:23 UTC (permalink / raw)
  To: Jeff Law
  Cc: pan2.li, gcc-patches, juzhe.zhong, yanzhang.wang, kito.cheng,
	richard.guenther

Jeff Law <jeffreyalaw@gmail.com> writes:
> On 11/8/23 23:08, pan2.li@intel.com wrote:
>> From: Pan Li <pan2.li@intel.com>
>> 
>> Update in v2:
>> * Move vector type support to get_stored_val.
>> 
>> Original log:
>> 
>> This patch would like to allow the vector mode in the
>> get_stored_val in the DSE. It is valid for the read
>> rtx if and only if the read bitsize is less than the
>> stored bitsize.
>> 
>> Given below example code with
>> --param=riscv-autovec-preference=fixed-vlmax.
>> 
>> vuint8m1_t test () {
>>    uint8_t arr[32] = {
>>      1, 2, 7, 1, 3, 4, 5, 3, 1, 0, 1, 2, 4, 4, 9, 9,
>>      1, 2, 7, 1, 3, 4, 5, 3, 1, 0, 1, 2, 4, 4, 9, 9,
>>    };
>> 
>>    return __riscv_vle8_v_u8m1(arr, 32);
>> }
>> 
>> Before this patch:
>> test:
>>    lui     a5,%hi(.LANCHOR0)
>>    addi    sp,sp,-32
>>    addi    a5,a5,%lo(.LANCHOR0)
>>    li      a3,32
>>    vl2re64.v       v2,0(a5)
>>    vsetvli zero,a3,e8,m1,ta,ma
>>    vs2r.v  v2,0(sp)             <== Unnecessary store to stack
>>    vle8.v  v1,0(sp)             <== Ditto
>>    vs1r.v  v1,0(a0)
>>    addi    sp,sp,32
>>    jr      ra
>> 
>> After this patch:
>> test:
>>    lui     a5,%hi(.LANCHOR0)
>>    addi    a5,a5,%lo(.LANCHOR0)
>>    li      a4,32
>>    addi    sp,sp,-32
>>    vsetvli zero,a4,e8,m1,ta,ma
>>    vle8.v  v1,0(a5)
>>    vs1r.v  v1,0(a0)
>>    addi    sp,sp,32
>>    jr      ra
>> 
>> Below tests are passed within this patch:
>> 
>> * The x86 bootstrap and regression test.
>> * The aarch64 regression test.
>> * The risc-v regression test.
>> 
>> 	PR target/111720
>> 
>> gcc/ChangeLog:
>> 
>> 	* dse.cc (get_stored_val): Allow vector mode if the read
>> 	bitsize is less than stored bitsize.
>> 
>> gcc/testsuite/ChangeLog:
>> 
>> 	* gcc.target/riscv/rvv/base/pr111720-0.c: New test.
>> 	* gcc.target/riscv/rvv/base/pr111720-1.c: New test.
>> 	* gcc.target/riscv/rvv/base/pr111720-10.c: New test.
>> 	* gcc.target/riscv/rvv/base/pr111720-2.c: New test.
>> 	* gcc.target/riscv/rvv/base/pr111720-3.c: New test.
>> 	* gcc.target/riscv/rvv/base/pr111720-4.c: New test.
>> 	* gcc.target/riscv/rvv/base/pr111720-5.c: New test.
>> 	* gcc.target/riscv/rvv/base/pr111720-6.c: New test.
>> 	* gcc.target/riscv/rvv/base/pr111720-7.c: New test.
>> 	* gcc.target/riscv/rvv/base/pr111720-8.c: New test.
>> 	* gcc.target/riscv/rvv/base/pr111720-9.c: New test.
> We're always getting the lowpart here AFAICT and it appears that all the 
> right thing should happen if gen_lowpart_common fails (it returns NULL, 
> which bubbles up and is the right return value from get_stored_val if it 
> can't be optimized).

Yeah, we should always be operating on the lowpart, but it looks
like there's a latent bug.  This check:

  if (gap.is_constant () && maybe_ne (gap, 0))
    {
      ...
    }
  else ...

means that we ignore the gap if it's a nonzero runtime value.
I guess it should be:

  if (maybe_ne (gap, 0))
    {
      if (!gap.is_constant ())
        return NULL_RTX;
      ...
    }

instead.  Alternatively, we could remove the is_constant condition
and fix PR87815 in a different way, e.g. by protecting the
smallest_int_mode_for_size with a tighter condition.  That might
allow a similar DSE optimisation to this patch for nonzero offsets,
thanks to:

      if (multiple_p (shift, GET_MODE_BITSIZE (new_mode))
	  && known_le (GET_MODE_SIZE (new_mode), GET_MODE_SIZE (store_mode)))
	{
	  /* Try to implement the shift using a subreg.  */
          ...

> Did you want to use known_le so that you'd pick up the case when the two 
> modes are the same size?  Or was known_lt the test you really wanted 
> (and if so, why).

Agree it should be known_le FWIW.
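
As a rough illustration of the gap check discussed above, here is a
stand-alone model of the two conditions.  It is only a sketch under
stated assumptions: poly_gap, is_constant, maybe_nonzero and the action
enum are hypothetical names, not GCC's poly_int API, and the gap is
modelled as c0 + c1 * x for an unknown runtime x >= 0.

```cpp
#include <cstdint>

// Hypothetical model of a gap of the form c0 + c1 * x.
struct poly_gap
{
  int64_t c0;
  int64_t c1;
};

// The gap is a compile-time constant when it has no runtime term.
bool
is_constant (poly_gap g)
{
  return g.c1 == 0;
}

// The gap might differ from zero for some value of x.
bool
maybe_nonzero (poly_gap g)
{
  return g.c0 != 0 || g.c1 != 0;
}

enum class action { treat_as_zero_gap, use_shift, give_up };

// Shape of the existing check: a nonzero runtime gap falls through to
// the zero-gap path.
action
old_check (poly_gap g)
{
  if (is_constant (g) && maybe_nonzero (g))
    return action::use_shift;
  return action::treat_as_zero_gap;
}

// Shape of the suggested check: a nonzero runtime gap gives up instead.
action
new_check (poly_gap g)
{
  if (maybe_nonzero (g))
    {
      if (!is_constant (g))
        return action::give_up;
      return action::use_shift;
    }
  return action::treat_as_zero_gap;
}
```

In this model the two checks agree for a zero gap and for a constant
nonzero gap; they differ only for a gap such as poly_gap{0, 4}, where
the old shape silently takes the zero-gap path.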

Thanks,
Richard

^ permalink raw reply	[flat|nested] 18+ messages in thread

* RE: [PATCH v2] DSE: Allow vector type for get_stored_val when read < store
  2023-11-11 15:23     ` Richard Sandiford
@ 2023-11-12  2:30       ` Li, Pan2
  2023-11-13  3:25         ` Li, Pan2
  0 siblings, 1 reply; 18+ messages in thread
From: Li, Pan2 @ 2023-11-12  2:30 UTC (permalink / raw)
  To: Richard Sandiford, Jeff Law
  Cc: gcc-patches, juzhe.zhong, Wang, Yanzhang, kito.cheng, richard.guenther

Thanks Richard S and Jeff for the comments.

> Did you want to use known_le so that you'd pick up the case when the two 
> modes are the same size?  Or was known_lt the test you really wanted 
> (and if so, why).

I took known_lt in v2 so that the equal-size case would still go down the original code path.
I just tried known_le and got various ICEs during testing; I suspect they are related to the
latent bug that Richard S mentioned.

> instead.  Alternatively, we could remove the is_constant condition
> and fix PR87815 in a different way, e.g. by protecting the
> smallest_int_mode_for_size with a tighter condition.  That might
> allow a similar DSE optimisation to this patch for nonzero offsets,
> thanks to:

So it looks like we should fix PR87815 in the way Richard S suggested before
we take known_le for vector modes here.

I will give it a try soon and keep you posted.

Pan

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH v3] DSE: Allow vector type for get_stored_val when read < store
  2023-11-02  3:14 [PATCH v1] EXPMED: Allow vector mode for DSE extract_low_bits [PR111720] pan2.li
  2023-11-02  8:19 ` Richard Biener
  2023-11-09  6:08 ` [PATCH v2] DSE: Allow vector type for get_stored_val when read < store pan2.li
@ 2023-11-12 12:27 ` pan2.li
  2023-11-13  3:22 ` [PATCH v4] " pan2.li
  3 siblings, 0 replies; 18+ messages in thread
From: pan2.li @ 2023-11-12 12:27 UTC (permalink / raw)
  To: gcc-patches
  Cc: juzhe.zhong, pan2.li, yanzhang.wang, kito.cheng,
	richard.guenther, richard.sandiford, jeffreyalaw

From: Pan Li <pan2.li@intel.com>

Update in v3:
* Take known_le instead of known_lt for vector size.
* Return NULL_RTX when the gap is nonzero and not constant.

Update in v2:
* Move vector type support to get_stored_val.

Original log:

This patch allows vector modes in get_stored_val in DSE.
This is valid for the read rtx if and only if the read
bitsize is less than the stored bitsize.

Consider the example code below, compiled with
--param=riscv-autovec-preference=fixed-vlmax.

vuint8m1_t test () {
  uint8_t arr[32] = {
    1, 2, 7, 1, 3, 4, 5, 3, 1, 0, 1, 2, 4, 4, 9, 9,
    1, 2, 7, 1, 3, 4, 5, 3, 1, 0, 1, 2, 4, 4, 9, 9,
  };

  return __riscv_vle8_v_u8m1(arr, 32);
}

Before this patch:
test:
  lui     a5,%hi(.LANCHOR0)
  addi    sp,sp,-32
  addi    a5,a5,%lo(.LANCHOR0)
  li      a3,32
  vl2re64.v       v2,0(a5)
  vsetvli zero,a3,e8,m1,ta,ma
  vs2r.v  v2,0(sp)             <== Unnecessary store to stack
  vle8.v  v1,0(sp)             <== Ditto
  vs1r.v  v1,0(a0)
  addi    sp,sp,32
  jr      ra

After this patch:
test:
  lui     a5,%hi(.LANCHOR0)
  addi    a5,a5,%lo(.LANCHOR0)
  li      a4,32
  addi    sp,sp,-32
  vsetvli zero,a4,e8,m1,ta,ma
  vle8.v  v1,0(a5)
  vs1r.v  v1,0(a0)
  addi    sp,sp,32
  jr      ra

The following tests pass with this patch:
* The risc-v regression test.

The following tests are still running with this patch:
* The x86 bootstrap and regression test.
* The aarch64 regression test.

	PR target/111720

gcc/ChangeLog:

	* dse.cc (get_stored_val): Allow vector mode if read size is
	less than or equal to stored size.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/base/float-point-dynamic-frm-54.c: Adjust
	the asm checker.
	* gcc.target/riscv/rvv/base/float-point-dynamic-frm-57.c: Ditto.
	* gcc.target/riscv/rvv/base/float-point-dynamic-frm-58.c: Ditto.
	* gcc.target/riscv/rvv/base/pr111720-0.c: New test.
	* gcc.target/riscv/rvv/base/pr111720-1.c: New test.
	* gcc.target/riscv/rvv/base/pr111720-10.c: New test.
	* gcc.target/riscv/rvv/base/pr111720-2.c: New test.
	* gcc.target/riscv/rvv/base/pr111720-3.c: New test.
	* gcc.target/riscv/rvv/base/pr111720-4.c: New test.
	* gcc.target/riscv/rvv/base/pr111720-5.c: New test.
	* gcc.target/riscv/rvv/base/pr111720-6.c: New test.
	* gcc.target/riscv/rvv/base/pr111720-7.c: New test.
	* gcc.target/riscv/rvv/base/pr111720-8.c: New test.
	* gcc.target/riscv/rvv/base/pr111720-9.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>
---
 gcc/dse.cc                                    |  9 +++++++-
 .../rvv/base/float-point-dynamic-frm-54.c     |  2 +-
 .../rvv/base/float-point-dynamic-frm-57.c     |  2 +-
 .../rvv/base/float-point-dynamic-frm-58.c     |  2 +-
 .../gcc.target/riscv/rvv/base/pr111720-0.c    | 18 ++++++++++++++++
 .../gcc.target/riscv/rvv/base/pr111720-1.c    | 18 ++++++++++++++++
 .../gcc.target/riscv/rvv/base/pr111720-10.c   | 18 ++++++++++++++++
 .../gcc.target/riscv/rvv/base/pr111720-2.c    | 18 ++++++++++++++++
 .../gcc.target/riscv/rvv/base/pr111720-3.c    | 18 ++++++++++++++++
 .../gcc.target/riscv/rvv/base/pr111720-4.c    | 18 ++++++++++++++++
 .../gcc.target/riscv/rvv/base/pr111720-5.c    | 18 ++++++++++++++++
 .../gcc.target/riscv/rvv/base/pr111720-6.c    | 18 ++++++++++++++++
 .../gcc.target/riscv/rvv/base/pr111720-7.c    | 21 +++++++++++++++++++
 .../gcc.target/riscv/rvv/base/pr111720-8.c    | 18 ++++++++++++++++
 .../gcc.target/riscv/rvv/base/pr111720-9.c    | 15 +++++++++++++
 15 files changed, 209 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-0.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-10.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-3.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-4.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-5.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-6.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-7.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-8.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-9.c

diff --git a/gcc/dse.cc b/gcc/dse.cc
index 1a85dae1f8c..40c4c29d07e 100644
--- a/gcc/dse.cc
+++ b/gcc/dse.cc
@@ -1900,8 +1900,11 @@ get_stored_val (store_info *store_info, machine_mode read_mode,
   else
     gap = read_offset - store_info->offset;
 
-  if (gap.is_constant () && maybe_ne (gap, 0))
+  if (maybe_ne (gap, 0))
     {
+      if (!gap.is_constant ())
+	return NULL_RTX;
+
       poly_int64 shift = gap * BITS_PER_UNIT;
       poly_int64 access_size = GET_MODE_SIZE (read_mode) + gap;
       read_reg = find_shift_sequence (access_size, store_info, read_mode,
@@ -1940,6 +1943,10 @@ get_stored_val (store_info *store_info, machine_mode read_mode,
 	       || GET_MODE_CLASS (read_mode) != GET_MODE_CLASS (store_mode)))
     read_reg = extract_low_bits (read_mode, store_mode,
 				 copy_rtx (store_info->const_rhs));
+  else if (VECTOR_MODE_P (read_mode) && VECTOR_MODE_P (store_mode)
+    && known_le (GET_MODE_BITSIZE (read_mode), GET_MODE_BITSIZE (store_mode))
+    && targetm.modes_tieable_p (read_mode, store_mode))
+    read_reg = gen_lowpart (read_mode, copy_rtx (store_info->rhs));
   else
     read_reg = extract_low_bits (read_mode, store_mode,
 				 copy_rtx (store_info->rhs));
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-dynamic-frm-54.c b/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-dynamic-frm-54.c
index 8c67d4bba81..f33f303c0cb 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-dynamic-frm-54.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-dynamic-frm-54.c
@@ -33,6 +33,6 @@ test_float_point_dynamic_frm (vfloat32m1_t op1, vfloat32m1_t op2,
 
 /* { dg-final { scan-assembler-times {vfadd\.v[vf]\s+v[0-9]+,\s*v[0-9]+,\s*[fav]+[0-9]+} 4 } } */
 /* { dg-final { scan-assembler-times {frrm\s+[axs][0-9]+} 3 } } */
-/* { dg-final { scan-assembler-times {fsrm\s+[axs][0-9]+} 4 } } */
+/* { dg-final { scan-assembler-times {fsrm\s+[axs][0-9]+} 2 } } */
 /* { dg-final { scan-assembler-times {fsrmi\s+[01234]} 1 } } */
 /* { dg-final { scan-assembler-not {fsrmi\s+[axs][0-9]+,\s*[01234]} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-dynamic-frm-57.c b/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-dynamic-frm-57.c
index 7ac9c960e65..cc0fb556da3 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-dynamic-frm-57.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-dynamic-frm-57.c
@@ -33,6 +33,6 @@ test_float_point_dynamic_frm (vfloat32m1_t op1, vfloat32m1_t op2,
 
 /* { dg-final { scan-assembler-times {vfadd\.v[vf]\s+v[0-9]+,\s*v[0-9]+,\s*[fav]+[0-9]+} 4 } } */
 /* { dg-final { scan-assembler-times {frrm\s+[axs][0-9]+} 3 } } */
-/* { dg-final { scan-assembler-times {fsrm\s+[axs][0-9]+} 4 } } */
+/* { dg-final { scan-assembler-times {fsrm\s+[axs][0-9]+} 2 } } */
 /* { dg-final { scan-assembler-times {fsrmi\s+[01234]} 1 } } */
 /* { dg-final { scan-assembler-not {fsrmi\s+[axs][0-9]+,\s*[01234]} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-dynamic-frm-58.c b/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-dynamic-frm-58.c
index c5f96bc45c0..c5c3408be30 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-dynamic-frm-58.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-dynamic-frm-58.c
@@ -33,6 +33,6 @@ test_float_point_dynamic_frm (vfloat32m1_t op1, vfloat32m1_t op2,
 
 /* { dg-final { scan-assembler-times {vfadd\.v[vf]\s+v[0-9]+,\s*v[0-9]+,\s*[fav]+[0-9]+} 4 } } */
 /* { dg-final { scan-assembler-times {frrm\s+[axs][0-9]+} 3 } } */
-/* { dg-final { scan-assembler-times {fsrm\s+[axs][0-9]+} 4 } } */
+/* { dg-final { scan-assembler-times {fsrm\s+[axs][0-9]+} 2 } } */
 /* { dg-final { scan-assembler-times {fsrmi\s+[01234]} 2 } } */
 /* { dg-final { scan-assembler-not {fsrmi\s+[axs][0-9]+,\s*[01234]} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-0.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-0.c
new file mode 100644
index 00000000000..a61e94a6d98
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-0.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -march=rv64gcv -mabi=lp64d -ftree-vectorize --param=riscv-autovec-preference=fixed-vlmax -Wno-psabi" } */
+
+#include "riscv_vector.h"
+
+vuint8m1_t test () {
+  uint8_t arr[32] = {
+    1, 2, 7, 1, 3, 4, 5, 3,
+    1, 0, 1, 2, 4, 4, 9, 9,
+    1, 2, 7, 1, 3, 4, 5, 3,
+    1, 0, 1, 2, 4, 4, 9, 9,
+  };
+
+  return __riscv_vle8_v_u8m1(arr, 32);
+}
+
+/* { dg-final { scan-assembler-not {vle[0-9]+\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} } } */
+/* { dg-final { scan-assembler-not {vs[0-9]+r\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-1.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-1.c
new file mode 100644
index 00000000000..46efd7379ac
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-1.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -march=rv64gcv -mabi=lp64d -ftree-vectorize --param=riscv-autovec-preference=fixed-vlmax -Wno-psabi" } */
+
+#include "riscv_vector.h"
+
+vuint8m2_t test () {
+  uint8_t arr[32] = {
+    1, 2, 7, 1, 3, 4, 5, 3,
+    1, 0, 1, 2, 4, 4, 9, 9,
+    1, 2, 7, 1, 3, 4, 5, 3,
+    1, 0, 1, 2, 4, 4, 9, 9,
+  };
+
+  return __riscv_vle8_v_u8m2(arr, 32);
+}
+
+/* { dg-final { scan-assembler-not {vle[0-9]+\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} } } */
+/* { dg-final { scan-assembler-not {vs[0-9]+r\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-10.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-10.c
new file mode 100644
index 00000000000..8bebac219a6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-10.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -march=rv64gcv -mabi=lp64d -ftree-vectorize --param=riscv-autovec-preference=fixed-vlmax -Wno-psabi" } */
+
+#include "riscv_vector.h"
+
+vbool4_t test () {
+  uint8_t arr[32] = {
+    1, 2, 7, 1, 3, 4, 5, 3,
+    1, 0, 1, 2, 4, 4, 9, 9,
+    1, 2, 7, 1, 3, 4, 5, 3,
+    1, 0, 1, 2, 4, 4, 9, 9,
+  };
+
+  return __riscv_vlm_v_b4(arr, 32);
+}
+
+/* { dg-final { scan-assembler-not {vle[0-9]+\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} } } */
+/* { dg-final { scan-assembler-not {vs[0-9]+r\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-2.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-2.c
new file mode 100644
index 00000000000..47e4243e02e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-2.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -march=rv64gcv -mabi=lp64d -ftree-vectorize --param=riscv-autovec-preference=fixed-vlmax -Wno-psabi" } */
+
+#include "riscv_vector.h"
+
+vuint8m1_t test () {
+  uint8_t arr[32] = {
+    1, 2, 7, 1, 3, 4, 5, 3,
+    1, 0, 1, 2, 4, 4, 9, 9,
+    1, 2, 7, 1, 3, 4, 5, 3,
+    1, 0, 1, 2, 4, 4, 9, 9,
+  };
+
+  return __riscv_vle8_v_u8m1(arr, 16);
+}
+
+/* { dg-final { scan-assembler-not {vle[0-9]+\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} } } */
+/* { dg-final { scan-assembler-not {vs[0-9]+r\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-3.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-3.c
new file mode 100644
index 00000000000..5331e547ed3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-3.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -march=rv64gcv -mabi=lp64d -ftree-vectorize --param=riscv-autovec-preference=fixed-vlmax -Wno-psabi" } */
+
+#include "riscv_vector.h"
+
+vuint8m2_t test () {
+  uint8_t arr[32] = {
+    1, 2, 7, 1, 3, 4, 5, 3,
+    1, 0, 1, 2, 4, 4, 9, 9,
+    1, 2, 7, 1, 3, 4, 5, 3,
+    1, 0, 1, 2, 4, 4, 9, 9,
+  };
+
+  return __riscv_vle8_v_u8m2(arr, 8);
+}
+
+/* { dg-final { scan-assembler-not {vle[0-9]+\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} } } */
+/* { dg-final { scan-assembler-not {vs[0-9]+r\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-4.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-4.c
new file mode 100644
index 00000000000..0c728f93514
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-4.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -march=rv64gcv -mabi=lp64d -ftree-vectorize --param=riscv-autovec-preference=fixed-vlmax -Wno-psabi" } */
+
+#include "riscv_vector.h"
+
+vuint8mf2_t test () {
+  uint8_t arr[32] = {
+    1, 2, 7, 1, 3, 4, 5, 3,
+    1, 0, 1, 2, 4, 4, 9, 9,
+    1, 2, 7, 1, 3, 4, 5, 3,
+    1, 0, 1, 2, 4, 4, 9, 9,
+  };
+
+  return __riscv_vle8_v_u8mf2(arr, 32);
+}
+
+/* { dg-final { scan-assembler-not {vle[0-9]+\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} } } */
+/* { dg-final { scan-assembler-not {vs[0-9]+r\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-5.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-5.c
new file mode 100644
index 00000000000..ccfc40cd382
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-5.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -march=rv64gcv -mabi=lp64d -ftree-vectorize --param=riscv-autovec-preference=fixed-vlmax -Wno-psabi" } */
+
+#include "riscv_vector.h"
+
+vuint8m2_t test () {
+  uint8_t arr[32] = {
+    1, 2, 7, 1, 3, 4, 5, 3,
+    1, 0, 1, 2, 4, 4, 9, 9,
+    1, 2, 7, 1, 3, 4, 5, 3,
+    1, 0, 1, 2, 4, 4, 9, 9,
+  };
+
+  return __riscv_vle8_v_u8m2(arr, 4);
+}
+
+/* { dg-final { scan-assembler-not {vle[0-9]+\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} } } */
+/* { dg-final { scan-assembler-not {vs[0-9]+r\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-6.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-6.c
new file mode 100644
index 00000000000..ce7ddbb99b2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-6.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -march=rv64gcv -mabi=lp64d -ftree-vectorize --param=riscv-autovec-preference=fixed-vlmax -Wno-psabi" } */
+
+#include "riscv_vector.h"
+
+vuint8m8_t test () {
+  uint8_t arr[32] = {
+    1, 2, 7, 1, 3, 4, 5, 3,
+    1, 0, 1, 2, 4, 4, 9, 9,
+    1, 2, 7, 1, 3, 4, 5, 3,
+    1, 0, 1, 2, 4, 4, 9, 9,
+  };
+
+  return __riscv_vle8_v_u8m8(arr, 32);
+}
+
+/* { dg-final { scan-assembler-times {vle[0-9]+\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} 1 } } */
+/* { dg-final { scan-assembler-times {vs[0-9]+r\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} 1 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-7.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-7.c
new file mode 100644
index 00000000000..ac0100a1211
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-7.c
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -march=rv64gcv -mabi=lp64d -ftree-vectorize --param=riscv-autovec-preference=fixed-vlmax -Wno-psabi" } */
+
+#include "riscv_vector.h"
+
+vbool8_t test () {
+  uint8_t arr[32] = {
+    1, 2, 7, 1, 3, 4, 5, 3,
+    1, 0, 1, 2, 4, 4, 9, 9,
+    1, 2, 7, 1, 3, 4, 5, 3,
+    1, 0, 1, 2, 4, 4, 9, 9,
+  };
+
+  vuint8m1_t varr = __riscv_vle8_v_u8m1(arr, 32);
+  vuint8m1_t vand_m = __riscv_vand_vx_u8m1(varr, 1, 32);
+
+  return __riscv_vreinterpret_v_u8m1_b8(vand_m);
+}
+
+/* { dg-final { scan-assembler-not {vle[0-9]+\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} } } */
+/* { dg-final { scan-assembler-not {vs[0-9]+r\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-8.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-8.c
new file mode 100644
index 00000000000..b7ebef80954
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-8.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -march=rv64gcv -mabi=lp64d -ftree-vectorize --param=riscv-autovec-preference=fixed-vlmax -Wno-psabi" } */
+
+#include "riscv_vector.h"
+
+vfloat32m1_t test () {
+  float arr[32] = {
+    1.0, 2.2, 7.8, 1.2, 3.3, 4.7, 5.5, 3.3,
+    1.0, 0.2, 1.8, 2.2, 4.3, 4.7, 9.5, 9.3,
+    1.0, 2.2, 7.8, 1.2, 3.3, 4.7, 5.5, 3.3,
+    1.0, 0.2, 1.8, 2.2, 4.3, 4.7, 9.5, 9.3,
+  };
+
+  return __riscv_vle32_v_f32m1(arr, 32);
+}
+
+/* { dg-final { scan-assembler-not {vle[0-9]+\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} } } */
+/* { dg-final { scan-assembler-not {vs[0-9]+r\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-9.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-9.c
new file mode 100644
index 00000000000..21fed06d201
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-9.c
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -march=rv64gcv -mabi=lp64d -ftree-vectorize --param=riscv-autovec-preference=fixed-vlmax -Wno-psabi" } */
+
+#include "riscv_vector.h"
+
+vfloat64m8_t test () {
+  double arr[8] = {
+    1.0, 2.2, 7.8, 1.2, 3.3, 4.7, 5.5, 3.3,
+  };
+
+  return __riscv_vle64_v_f64m8(arr, 4);
+}
+
+/* { dg-final { scan-assembler-times {vle[0-9]+\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} 1 } } */
+/* { dg-final { scan-assembler-times {vs[0-9]+r\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} 1 } } */
-- 
2.34.1



* [PATCH v4] DSE: Allow vector type for get_stored_val when read < store
  2023-11-02  3:14 [PATCH v1] EXPMED: Allow vector mode for DSE extract_low_bits [PR111720] pan2.li
                   ` (2 preceding siblings ...)
  2023-11-12 12:27 ` [PATCH v3] " pan2.li
@ 2023-11-13  3:22 ` pan2.li
  2023-11-13 20:11   ` Jeff Law
  3 siblings, 1 reply; 18+ messages in thread
From: pan2.li @ 2023-11-13  3:22 UTC (permalink / raw)
  To: gcc-patches
  Cc: juzhe.zhong, pan2.li, yanzhang.wang, kito.cheng,
	richard.guenther, richard.sandiford, jeffreyalaw

From: Pan Li <pan2.li@intel.com>

Update in v4:
* Merged with upstream and removed some independent changes.

Update in v3:
* Use known_le instead of known_lt for the vector size check.
* Return NULL_RTX when gap is nonzero and not constant.

Update in v2:
* Move vector type support to get_stored_val.

Original log:

This patch allows vector modes in get_stored_val in DSE. The
stored value can be reused for the read rtx when the read
bitsize is less than or equal to the stored bitsize.

Consider the example code below, compiled with
--param=riscv-autovec-preference=fixed-vlmax.

vuint8m1_t test () {
  uint8_t arr[32] = {
    1, 2, 7, 1, 3, 4, 5, 3, 1, 0, 1, 2, 4, 4, 9, 9,
    1, 2, 7, 1, 3, 4, 5, 3, 1, 0, 1, 2, 4, 4, 9, 9,
  };

  return __riscv_vle8_v_u8m1(arr, 32);
}

Before this patch:
test:
  lui     a5,%hi(.LANCHOR0)
  addi    sp,sp,-32
  addi    a5,a5,%lo(.LANCHOR0)
  li      a3,32
  vl2re64.v       v2,0(a5)
  vsetvli zero,a3,e8,m1,ta,ma
  vs2r.v  v2,0(sp)             <== Unnecessary store to stack
  vle8.v  v1,0(sp)             <== Ditto
  vs1r.v  v1,0(a0)
  addi    sp,sp,32
  jr      ra

After this patch:
test:
  lui     a5,%hi(.LANCHOR0)
  addi    a5,a5,%lo(.LANCHOR0)
  li      a4,32
  addi    sp,sp,-32
  vsetvli zero,a4,e8,m1,ta,ma
  vle8.v  v1,0(a5)
  vs1r.v  v1,0(a0)
  addi    sp,sp,32
  jr      ra
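
As a rough illustration of why the stack round trip in the first listing is
redundant, the following C sketch models the wide store and the narrower read
as plain byte copies. The sizes (a 32-byte store and a 16-byte read, which
assumes 128-bit vector registers) and all function names are assumptions made
for this sketch; none of them come from dse.cc.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed sizes: the store covers 32 bytes, the read only the low 16.  */
enum { STORE_BYTES = 32, READ_BYTES = 16 };

/* Before: spill the wide value to a stack slot, then reload its low part.  */
static void
read_via_stack (const uint8_t *wide, uint8_t *narrow)
{
  uint8_t stack_slot[STORE_BYTES];

  assert (wide != NULL && narrow != NULL);
  memcpy (stack_slot, wide, STORE_BYTES);   /* models vs2r.v v2,0(sp)  */
  memcpy (narrow, stack_slot, READ_BYTES);  /* models vle8.v v1,0(sp)  */
}

/* After: take the low part of the wide value without touching memory.  */
static void
read_low_part (const uint8_t *wide, uint8_t *narrow)
{
  assert (wide != NULL && narrow != NULL);
  memcpy (narrow, wide, READ_BYTES);
}

/* Return 1 if both paths produce the same bytes for the example array.  */
static int
paths_agree_on_example (void)
{
  static const uint8_t arr[STORE_BYTES] = {
    1, 2, 7, 1, 3, 4, 5, 3, 1, 0, 1, 2, 4, 4, 9, 9,
    1, 2, 7, 1, 3, 4, 5, 3, 1, 0, 1, 2, 4, 4, 9, 9,
  };
  uint8_t via_stack[READ_BYTES];
  uint8_t low_part[READ_BYTES];

  read_via_stack (arr, via_stack);
  read_low_part (arr, low_part);
  return memcmp (via_stack, low_part, READ_BYTES) == 0;
}
```

Both paths yield the same bytes, so the store/load pair through the stack
slot adds nothing once the low part of the stored value is available.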

The following tests pass with this patch:
* The risc-v regression test.
* The x86 bootstrap and regression test.
* The aarch64 regression test.

	PR target/111720

gcc/ChangeLog:

	* dse.cc (get_stored_val): Allow vector mode if read size is
	less than or equal to stored size.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/base/pr111720-0.c: New test.
	* gcc.target/riscv/rvv/base/pr111720-1.c: New test.
	* gcc.target/riscv/rvv/base/pr111720-10.c: New test.
	* gcc.target/riscv/rvv/base/pr111720-2.c: New test.
	* gcc.target/riscv/rvv/base/pr111720-3.c: New test.
	* gcc.target/riscv/rvv/base/pr111720-4.c: New test.
	* gcc.target/riscv/rvv/base/pr111720-5.c: New test.
	* gcc.target/riscv/rvv/base/pr111720-6.c: New test.
	* gcc.target/riscv/rvv/base/pr111720-7.c: New test.
	* gcc.target/riscv/rvv/base/pr111720-8.c: New test.
	* gcc.target/riscv/rvv/base/pr111720-9.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>
---
 gcc/dse.cc                                    |  9 +++++++-
 .../gcc.target/riscv/rvv/base/pr111720-0.c    | 18 ++++++++++++++++
 .../gcc.target/riscv/rvv/base/pr111720-1.c    | 18 ++++++++++++++++
 .../gcc.target/riscv/rvv/base/pr111720-10.c   | 18 ++++++++++++++++
 .../gcc.target/riscv/rvv/base/pr111720-2.c    | 18 ++++++++++++++++
 .../gcc.target/riscv/rvv/base/pr111720-3.c    | 18 ++++++++++++++++
 .../gcc.target/riscv/rvv/base/pr111720-4.c    | 18 ++++++++++++++++
 .../gcc.target/riscv/rvv/base/pr111720-5.c    | 18 ++++++++++++++++
 .../gcc.target/riscv/rvv/base/pr111720-6.c    | 18 ++++++++++++++++
 .../gcc.target/riscv/rvv/base/pr111720-7.c    | 21 +++++++++++++++++++
 .../gcc.target/riscv/rvv/base/pr111720-8.c    | 18 ++++++++++++++++
 .../gcc.target/riscv/rvv/base/pr111720-9.c    | 15 +++++++++++++
 12 files changed, 206 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-0.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-10.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-3.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-4.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-5.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-6.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-7.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-8.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-9.c

diff --git a/gcc/dse.cc b/gcc/dse.cc
index 1a85dae1f8c..40c4c29d07e 100644
--- a/gcc/dse.cc
+++ b/gcc/dse.cc
@@ -1900,8 +1900,11 @@ get_stored_val (store_info *store_info, machine_mode read_mode,
   else
     gap = read_offset - store_info->offset;
 
-  if (gap.is_constant () && maybe_ne (gap, 0))
+  if (maybe_ne (gap, 0))
     {
+      if (!gap.is_constant ())
+	return NULL_RTX;
+
       poly_int64 shift = gap * BITS_PER_UNIT;
       poly_int64 access_size = GET_MODE_SIZE (read_mode) + gap;
       read_reg = find_shift_sequence (access_size, store_info, read_mode,
@@ -1940,6 +1943,10 @@ get_stored_val (store_info *store_info, machine_mode read_mode,
 	       || GET_MODE_CLASS (read_mode) != GET_MODE_CLASS (store_mode)))
     read_reg = extract_low_bits (read_mode, store_mode,
 				 copy_rtx (store_info->const_rhs));
+  else if (VECTOR_MODE_P (read_mode) && VECTOR_MODE_P (store_mode)
+    && known_le (GET_MODE_BITSIZE (read_mode), GET_MODE_BITSIZE (store_mode))
+    && targetm.modes_tieable_p (read_mode, store_mode))
+    read_reg = gen_lowpart (read_mode, copy_rtx (store_info->rhs));
   else
     read_reg = extract_low_bits (read_mode, store_mode,
 				 copy_rtx (store_info->rhs));
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-0.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-0.c
new file mode 100644
index 00000000000..a61e94a6d98
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-0.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -march=rv64gcv -mabi=lp64d -ftree-vectorize --param=riscv-autovec-preference=fixed-vlmax -Wno-psabi" } */
+
+#include "riscv_vector.h"
+
+vuint8m1_t test () {
+  uint8_t arr[32] = {
+    1, 2, 7, 1, 3, 4, 5, 3,
+    1, 0, 1, 2, 4, 4, 9, 9,
+    1, 2, 7, 1, 3, 4, 5, 3,
+    1, 0, 1, 2, 4, 4, 9, 9,
+  };
+
+  return __riscv_vle8_v_u8m1(arr, 32);
+}
+
+/* { dg-final { scan-assembler-not {vle[0-9]+\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} } } */
+/* { dg-final { scan-assembler-not {vs[0-9]+r\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-1.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-1.c
new file mode 100644
index 00000000000..46efd7379ac
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-1.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -march=rv64gcv -mabi=lp64d -ftree-vectorize --param=riscv-autovec-preference=fixed-vlmax -Wno-psabi" } */
+
+#include "riscv_vector.h"
+
+vuint8m2_t test () {
+  uint8_t arr[32] = {
+    1, 2, 7, 1, 3, 4, 5, 3,
+    1, 0, 1, 2, 4, 4, 9, 9,
+    1, 2, 7, 1, 3, 4, 5, 3,
+    1, 0, 1, 2, 4, 4, 9, 9,
+  };
+
+  return __riscv_vle8_v_u8m2(arr, 32);
+}
+
+/* { dg-final { scan-assembler-not {vle[0-9]+\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} } } */
+/* { dg-final { scan-assembler-not {vs[0-9]+r\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-10.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-10.c
new file mode 100644
index 00000000000..8bebac219a6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-10.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -march=rv64gcv -mabi=lp64d -ftree-vectorize --param=riscv-autovec-preference=fixed-vlmax -Wno-psabi" } */
+
+#include "riscv_vector.h"
+
+vbool4_t test () {
+  uint8_t arr[32] = {
+    1, 2, 7, 1, 3, 4, 5, 3,
+    1, 0, 1, 2, 4, 4, 9, 9,
+    1, 2, 7, 1, 3, 4, 5, 3,
+    1, 0, 1, 2, 4, 4, 9, 9,
+  };
+
+  return __riscv_vlm_v_b4(arr, 32);
+}
+
+/* { dg-final { scan-assembler-not {vle[0-9]+\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} } } */
+/* { dg-final { scan-assembler-not {vs[0-9]+r\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-2.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-2.c
new file mode 100644
index 00000000000..47e4243e02e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-2.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -march=rv64gcv -mabi=lp64d -ftree-vectorize --param=riscv-autovec-preference=fixed-vlmax -Wno-psabi" } */
+
+#include "riscv_vector.h"
+
+vuint8m1_t test () {
+  uint8_t arr[32] = {
+    1, 2, 7, 1, 3, 4, 5, 3,
+    1, 0, 1, 2, 4, 4, 9, 9,
+    1, 2, 7, 1, 3, 4, 5, 3,
+    1, 0, 1, 2, 4, 4, 9, 9,
+  };
+
+  return __riscv_vle8_v_u8m1(arr, 16);
+}
+
+/* { dg-final { scan-assembler-not {vle[0-9]+\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} } } */
+/* { dg-final { scan-assembler-not {vs[0-9]+r\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-3.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-3.c
new file mode 100644
index 00000000000..5331e547ed3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-3.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -march=rv64gcv -mabi=lp64d -ftree-vectorize --param=riscv-autovec-preference=fixed-vlmax -Wno-psabi" } */
+
+#include "riscv_vector.h"
+
+vuint8m2_t test () {
+  uint8_t arr[32] = {
+    1, 2, 7, 1, 3, 4, 5, 3,
+    1, 0, 1, 2, 4, 4, 9, 9,
+    1, 2, 7, 1, 3, 4, 5, 3,
+    1, 0, 1, 2, 4, 4, 9, 9,
+  };
+
+  return __riscv_vle8_v_u8m2(arr, 8);
+}
+
+/* { dg-final { scan-assembler-not {vle[0-9]+\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} } } */
+/* { dg-final { scan-assembler-not {vs[0-9]+r\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-4.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-4.c
new file mode 100644
index 00000000000..0c728f93514
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-4.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -march=rv64gcv -mabi=lp64d -ftree-vectorize --param=riscv-autovec-preference=fixed-vlmax -Wno-psabi" } */
+
+#include "riscv_vector.h"
+
+vuint8mf2_t test () {
+  uint8_t arr[32] = {
+    1, 2, 7, 1, 3, 4, 5, 3,
+    1, 0, 1, 2, 4, 4, 9, 9,
+    1, 2, 7, 1, 3, 4, 5, 3,
+    1, 0, 1, 2, 4, 4, 9, 9,
+  };
+
+  return __riscv_vle8_v_u8mf2(arr, 32);
+}
+
+/* { dg-final { scan-assembler-not {vle[0-9]+\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} } } */
+/* { dg-final { scan-assembler-not {vs[0-9]+r\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-5.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-5.c
new file mode 100644
index 00000000000..ccfc40cd382
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-5.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -march=rv64gcv -mabi=lp64d -ftree-vectorize --param=riscv-autovec-preference=fixed-vlmax -Wno-psabi" } */
+
+#include "riscv_vector.h"
+
+vuint8m2_t test () {
+  uint8_t arr[32] = {
+    1, 2, 7, 1, 3, 4, 5, 3,
+    1, 0, 1, 2, 4, 4, 9, 9,
+    1, 2, 7, 1, 3, 4, 5, 3,
+    1, 0, 1, 2, 4, 4, 9, 9,
+  };
+
+  return __riscv_vle8_v_u8m2(arr, 4);
+}
+
+/* { dg-final { scan-assembler-not {vle[0-9]+\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} } } */
+/* { dg-final { scan-assembler-not {vs[0-9]+r\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-6.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-6.c
new file mode 100644
index 00000000000..ce7ddbb99b2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-6.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -march=rv64gcv -mabi=lp64d -ftree-vectorize --param=riscv-autovec-preference=fixed-vlmax -Wno-psabi" } */
+
+#include "riscv_vector.h"
+
+vuint8m8_t test () {
+  uint8_t arr[32] = {
+    1, 2, 7, 1, 3, 4, 5, 3,
+    1, 0, 1, 2, 4, 4, 9, 9,
+    1, 2, 7, 1, 3, 4, 5, 3,
+    1, 0, 1, 2, 4, 4, 9, 9,
+  };
+
+  return __riscv_vle8_v_u8m8(arr, 32);
+}
+
+/* { dg-final { scan-assembler-times {vle[0-9]+\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} 1 } } */
+/* { dg-final { scan-assembler-times {vs[0-9]+r\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} 1 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-7.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-7.c
new file mode 100644
index 00000000000..ac0100a1211
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-7.c
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -march=rv64gcv -mabi=lp64d -ftree-vectorize --param=riscv-autovec-preference=fixed-vlmax -Wno-psabi" } */
+
+#include "riscv_vector.h"
+
+vbool8_t test () {
+  uint8_t arr[32] = {
+    1, 2, 7, 1, 3, 4, 5, 3,
+    1, 0, 1, 2, 4, 4, 9, 9,
+    1, 2, 7, 1, 3, 4, 5, 3,
+    1, 0, 1, 2, 4, 4, 9, 9,
+  };
+
+  vuint8m1_t varr = __riscv_vle8_v_u8m1(arr, 32);
+  vuint8m1_t vand_m = __riscv_vand_vx_u8m1(varr, 1, 32);
+
+  return __riscv_vreinterpret_v_u8m1_b8(vand_m);
+}
+
+/* { dg-final { scan-assembler-not {vle[0-9]+\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} } } */
+/* { dg-final { scan-assembler-not {vs[0-9]+r\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-8.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-8.c
new file mode 100644
index 00000000000..b7ebef80954
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-8.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -march=rv64gcv -mabi=lp64d -ftree-vectorize --param=riscv-autovec-preference=fixed-vlmax -Wno-psabi" } */
+
+#include "riscv_vector.h"
+
+vfloat32m1_t test () {
+  float arr[32] = {
+    1.0, 2.2, 7.8, 1.2, 3.3, 4.7, 5.5, 3.3,
+    1.0, 0.2, 1.8, 2.2, 4.3, 4.7, 9.5, 9.3,
+    1.0, 2.2, 7.8, 1.2, 3.3, 4.7, 5.5, 3.3,
+    1.0, 0.2, 1.8, 2.2, 4.3, 4.7, 9.5, 9.3,
+  };
+
+  return __riscv_vle32_v_f32m1(arr, 32);
+}
+
+/* { dg-final { scan-assembler-not {vle[0-9]+\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} } } */
+/* { dg-final { scan-assembler-not {vs[0-9]+r\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-9.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-9.c
new file mode 100644
index 00000000000..21fed06d201
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-9.c
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -march=rv64gcv -mabi=lp64d -ftree-vectorize --param=riscv-autovec-preference=fixed-vlmax -Wno-psabi" } */
+
+#include "riscv_vector.h"
+
+vfloat64m8_t test () {
+  double arr[8] = {
+    1.0, 2.2, 7.8, 1.2, 3.3, 4.7, 5.5, 3.3,
+  };
+
+  return __riscv_vle64_v_f64m8(arr, 4);
+}
+
+/* { dg-final { scan-assembler-times {vle[0-9]+\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} 1 } } */
+/* { dg-final { scan-assembler-times {vs[0-9]+r\.v\s+v[0-9]+,\s*[0-9]+\(sp\)} 1 } } */
-- 
2.34.1
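
The dg-final directives in the tests above look for a vector load or a
whole-register store that addresses the stack pointer. As a hedged sketch of
what those patterns accept, the C code below applies POSIX extended-regex
spellings of the two patterns to the "before" and "after" listings from the
commit message. It assumes POSIX <regex.h> is available, writes Tcl's \s as
[[:space:]], and counts matching lines rather than individual matches;
count_matching_lines and the array names are hypothetical helpers for this
illustration, not part of the GCC testsuite.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

#define ARRAY_LEN(a) (sizeof (a) / sizeof ((a)[0]))

/* POSIX ERE spellings of the two dg-final patterns.  */
static const char STACK_VLE[] =
  "vle[0-9]+\\.v[[:space:]]+v[0-9]+,[[:space:]]*[0-9]+\\(sp\\)";
static const char STACK_VSR[] =
  "vs[0-9]+r\\.v[[:space:]]+v[0-9]+,[[:space:]]*[0-9]+\\(sp\\)";

/* Vector instructions from the "before" listing.  */
static const char *const BEFORE_ASM[] = {
  "vl2re64.v       v2,0(a5)",
  "vsetvli zero,a3,e8,m1,ta,ma",
  "vs2r.v  v2,0(sp)",
  "vle8.v  v1,0(sp)",
  "vs1r.v  v1,0(a0)",
};

/* Vector instructions from the "after" listing.  */
static const char *const AFTER_ASM[] = {
  "vsetvli zero,a4,e8,m1,ta,ma",
  "vle8.v  v1,0(a5)",
  "vs1r.v  v1,0(a0)",
};

/* Count the lines that contain a match for PATTERN; -1 on a bad pattern.  */
static int
count_matching_lines (const char *pattern, const char *const *lines, size_t n)
{
  regex_t re;
  int count = 0;

  assert (pattern != NULL && lines != NULL);
  if (regcomp (&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
    return -1;
  for (size_t i = 0; i < n; i++)
    if (regexec (&re, lines[i], 0, NULL, 0) == 0)
      count++;
  regfree (&re);
  return count;
}
```

With these inputs each pattern matches one line of the "before" listing and
none of the "after" listing. The digit class has to be [0-9]: a narrower
class such as [09] accepts only the digits 0 and 9, so it would never match
vs2r.v and a scan-assembler-not check using it would pass vacuously.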



* RE: [PATCH v2] DSE: Allow vector type for get_stored_val when read < store
  2023-11-12  2:30       ` Li, Pan2
@ 2023-11-13  3:25         ` Li, Pan2
  0 siblings, 0 replies; 18+ messages in thread
From: Li, Pan2 @ 2023-11-13  3:25 UTC (permalink / raw)
  To: Richard Sandiford, Jeff Law
  Cc: gcc-patches, juzhe.zhong, Wang, Yanzhang, kito.cheng, richard.guenther

v4 has been posted at the link below; please ignore v3.

https://gcc.gnu.org/pipermail/gcc-patches/2023-November/636216.html

Sorry for the inconvenience.

Pan

-----Original Message-----
From: Li, Pan2 
Sent: Sunday, November 12, 2023 10:31 AM
To: Richard Sandiford <richard.sandiford@arm.com>; Jeff Law <jeffreyalaw@gmail.com>
Cc: gcc-patches@gcc.gnu.org; juzhe.zhong@rivai.ai; Wang, Yanzhang <yanzhang.wang@intel.com>; kito.cheng@gmail.com; richard.guenther@gmail.com
Subject: RE: [PATCH v2] DSE: Allow vector type for get_stored_val when read < store

Thanks Richard S and Jeff for comments.

> Did you want to use known_le so that you'd pick up the case when the two 
> modes are the same size?  Or was known_lt the test you really wanted 
> (and if so, why).

I used known_lt in v2 so that the equal-size case would still go through the original code path.
I just tried known_le and got all sorts of ICEs during testing; I bet they are related to the
latent bug that Richard S mentioned.
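
To make the difference between the two predicates concrete, here is a small
stand-alone C model of sizes that depend on a runtime value. It is only a
sketch: struct poly_size, make_size, model_known_le and model_known_lt are
hypothetical names for this illustration and are not GCC's poly_int
implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for a size of the form c0 + c1 * x, where x >= 0
   is only known at run time (for example a vector length multiple).  */
struct poly_size
{
  int64_t c0;
  int64_t c1;
};

/* Build a size; both coefficients must be non-negative in this model.  */
static struct poly_size
make_size (int64_t c0, int64_t c1)
{
  assert (c0 >= 0 && c1 >= 0);
  struct poly_size s = { c0, c1 };
  return s;
}

/* True if a <= b for every possible x >= 0.  */
static bool
model_known_le (struct poly_size a, struct poly_size b)
{
  return a.c0 <= b.c0 && a.c1 <= b.c1;
}

/* True if a < b for every possible x >= 0.  */
static bool
model_known_lt (struct poly_size a, struct poly_size b)
{
  return a.c0 < b.c0 && a.c1 <= b.c1;
}
```

In this model two modes of the same size satisfy model_known_le but not
model_known_lt, which is exactly the equal-size case asked about above.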

> instead.  Alternatively, we could remove the is_constant condition
> and fix PR87815 in a different way, e.g. by protecting the
> smallest_int_mode_for_size with a tighter condition.  That might
> allow a similar DSE optimisation to this patch for nonzero offsets,
> thanks to:

So it looks like we should fix PR87815 in the way Richard S suggested before
we use known_le for vectors here.

I will give it a try soon and keep you posted.

Pan

-----Original Message-----
From: Richard Sandiford <richard.sandiford@arm.com> 
Sent: Saturday, November 11, 2023 11:23 PM
To: Jeff Law <jeffreyalaw@gmail.com>
Cc: Li, Pan2 <pan2.li@intel.com>; gcc-patches@gcc.gnu.org; juzhe.zhong@rivai.ai; Wang, Yanzhang <yanzhang.wang@intel.com>; kito.cheng@gmail.com; richard.guenther@gmail.com
Subject: Re: [PATCH v2] DSE: Allow vector type for get_stored_val when read < store

Jeff Law <jeffreyalaw@gmail.com> writes:
> On 11/8/23 23:08, pan2.li@intel.com wrote:
>> From: Pan Li <pan2.li@intel.com>
>> 
>> Update in v2:
>> * Move vector type support to get_stored_val.
>> 
>> Original log:
>> 
>> This patch would like to allow the vector mode in the
>> get_stored_val in the DSE. It is valid for the read
>> rtx if and only if the read bitsize is less than the
>> stored bitsize.
>> 
>> Given below example code with
>> --param=riscv-autovec-preference=fixed-vlmax.
>> 
>> vuint8m1_t test () {
>>    uint8_t arr[32] = {
>>      1, 2, 7, 1, 3, 4, 5, 3, 1, 0, 1, 2, 4, 4, 9, 9,
>>      1, 2, 7, 1, 3, 4, 5, 3, 1, 0, 1, 2, 4, 4, 9, 9,
>>    };
>> 
>>    return __riscv_vle8_v_u8m1(arr, 32);
>> }
>> 
>> Before this patch:
>> test:
>>    lui     a5,%hi(.LANCHOR0)
>>    addi    sp,sp,-32
>>    addi    a5,a5,%lo(.LANCHOR0)
>>    li      a3,32
>>    vl2re64.v       v2,0(a5)
>>    vsetvli zero,a3,e8,m1,ta,ma
>>    vs2r.v  v2,0(sp)             <== Unnecessary store to stack
>>    vle8.v  v1,0(sp)             <== Ditto
>>    vs1r.v  v1,0(a0)
>>    addi    sp,sp,32
>>    jr      ra
>> 
>> After this patch:
>> test:
>>    lui     a5,%hi(.LANCHOR0)
>>    addi    a5,a5,%lo(.LANCHOR0)
>>    li      a4,32
>>    addi    sp,sp,-32
>>    vsetvli zero,a4,e8,m1,ta,ma
>>    vle8.v  v1,0(a5)
>>    vs1r.v  v1,0(a0)
>>    addi    sp,sp,32
>>    jr      ra
>> 
>> Below tests are passed within this patch:
>> 
>> * The x86 bootstrap and regression test.
>> * The aarch64 regression test.
>> * The risc-v regression test.
>> 
>> 	PR target/111720
>> 
>> gcc/ChangeLog:
>> 
>> 	* dse.cc (get_stored_val): Allow vector mode if the read
>> 	bitsize is less than stored bitsize.
>> 
>> gcc/testsuite/ChangeLog:
>> 
>> 	* gcc.target/riscv/rvv/base/pr111720-0.c: New test.
>> 	* gcc.target/riscv/rvv/base/pr111720-1.c: New test.
>> 	* gcc.target/riscv/rvv/base/pr111720-10.c: New test.
>> 	* gcc.target/riscv/rvv/base/pr111720-2.c: New test.
>> 	* gcc.target/riscv/rvv/base/pr111720-3.c: New test.
>> 	* gcc.target/riscv/rvv/base/pr111720-4.c: New test.
>> 	* gcc.target/riscv/rvv/base/pr111720-5.c: New test.
>> 	* gcc.target/riscv/rvv/base/pr111720-6.c: New test.
>> 	* gcc.target/riscv/rvv/base/pr111720-7.c: New test.
>> 	* gcc.target/riscv/rvv/base/pr111720-8.c: New test.
>> 	* gcc.target/riscv/rvv/base/pr111720-9.c: New test.
> We're always getting the lowpart here AFAICT and it appears that all the 
> right thing should happen if gen_lowpart_common fails (it returns NULL, 
> which bubbles up and is the right return value from get_stored_val if it 
> can't be optimized).

Yeah, we should always be operating on the lowpart, but it looks
like there's a latent bug.  This check:

  if (gap.is_constant () && maybe_ne (gap, 0))
    {
      ...
    }
  else ...

means that we ignore the gap if it's a nonzero runtime value.
I guess it should be:

  if (maybe_ne (gap, 0))
    {
      if (!gap.is_constant ())
        return NULL_RTX;
      ...
    }

instead.  Alternatively, we could remove the is_constant condition
and fix PR87815 in a different way, e.g. by protecting the
smallest_int_mode_for_size with a tighter condition.  That might
allow a similar DSE optimisation to this patch for nonzero offsets,
thanks to:

      if (multiple_p (shift, GET_MODE_BITSIZE (new_mode))
	  && known_le (GET_MODE_SIZE (new_mode), GET_MODE_SIZE (store_mode)))
	{
	  /* Try to implement the shift using a subreg.  */
          ...
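
The two forms of the gap check discussed above can be compared with a small
C model. This is a sketch under stated assumptions: the gap is modelled as
c0 + c1 * x with x >= 0 unknown at compile time, and struct poly_gap,
make_gap, old_dispatch, new_dispatch and the path names are all hypothetical
stand-ins rather than code from dse.cc.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* gap = c0 + c1 * x, where x >= 0 is a runtime value.  */
struct poly_gap
{
  int64_t c0;
  int64_t c1;
};

enum path { PATH_SHIFT, PATH_LOWPART, PATH_GIVE_UP };

static struct poly_gap
make_gap (int64_t c0, int64_t c1)
{
  struct poly_gap g = { c0, c1 };
  return g;
}

/* Models gap.is_constant ().  */
static bool
gap_is_constant (struct poly_gap g)
{
  return g.c1 == 0;
}

/* Models maybe_ne (gap, 0): the gap is nonzero for at least one x.  */
static bool
gap_maybe_nonzero (struct poly_gap g)
{
  return g.c0 != 0 || g.c1 != 0;
}

/* Old check: a nonzero runtime gap falls through to the lowpart path.  */
static enum path
old_dispatch (struct poly_gap gap)
{
  if (gap_is_constant (gap) && gap_maybe_nonzero (gap))
    return PATH_SHIFT;
  return PATH_LOWPART;
}

/* Suggested check: a nonzero runtime gap gives up (returns NULL_RTX).  */
static enum path
new_dispatch (struct poly_gap gap)
{
  if (gap_maybe_nonzero (gap))
    {
      if (!gap_is_constant (gap))
        return PATH_GIVE_UP;
      return PATH_SHIFT;
    }
  return PATH_LOWPART;
}

/* Self-check of the case the latent bug is about.  */
static void
check_dispatch_models (void)
{
  struct poly_gap runtime_gap = make_gap (0, 4);

  assert (old_dispatch (runtime_gap) == PATH_LOWPART);  /* gap ignored  */
  assert (new_dispatch (runtime_gap) == PATH_GIVE_UP);  /* no forwarding  */
}
```

The two versions agree for a zero gap and for a nonzero constant gap; they
differ only when the gap has a runtime component.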

> Did you want to use known_le so that you'd pick up the case when the two 
> modes are the same size?  Or was known_lt the test you really wanted 
> (and if so, why).

Agree it should be known_le FWIW.

Thanks,
Richard


* Re: [PATCH v4] DSE: Allow vector type for get_stored_val when read < store
  2023-11-13  3:22 ` [PATCH v4] " pan2.li
@ 2023-11-13 20:11   ` Jeff Law
  2023-11-15  0:18     ` Li, Pan2
  0 siblings, 1 reply; 18+ messages in thread
From: Jeff Law @ 2023-11-13 20:11 UTC (permalink / raw)
  To: pan2.li, gcc-patches
  Cc: juzhe.zhong, yanzhang.wang, kito.cheng, richard.guenther,
	richard.sandiford



On 11/12/23 20:22, pan2.li@intel.com wrote:
> From: Pan Li <pan2.li@intel.com>
> 
> Update in v4:
> * Merge upstream and removed some independent changes.
> 
> Update in v3:
> * Take known_le instead of known_lt for vector size.
> * Return NULL_RTX when gap is not equal 0 and not constant.
> 
> Update in v2:
> * Move vector type support to get_stored_val.
> 
> Original log:
> 
> This patch would like to allow the vector mode in the
> get_stored_val in the DSE. It is valid for the read
> rtx if and only if the read bitsize is less than the
> stored bitsize.
> 
> Given below example code with
> --param=riscv-autovec-preference=fixed-vlmax.
> 
> vuint8m1_t test () {
>    uint8_t arr[32] = {
>      1, 2, 7, 1, 3, 4, 5, 3, 1, 0, 1, 2, 4, 4, 9, 9,
>      1, 2, 7, 1, 3, 4, 5, 3, 1, 0, 1, 2, 4, 4, 9, 9,
>    };
> 
>    return __riscv_vle8_v_u8m1(arr, 32);
> }
> 
> Before this patch:
> test:
>    lui     a5,%hi(.LANCHOR0)
>    addi    sp,sp,-32
>    addi    a5,a5,%lo(.LANCHOR0)
>    li      a3,32
>    vl2re64.v       v2,0(a5)
>    vsetvli zero,a3,e8,m1,ta,ma
>    vs2r.v  v2,0(sp)             <== Unnecessary store to stack
>    vle8.v  v1,0(sp)             <== Ditto
>    vs1r.v  v1,0(a0)
>    addi    sp,sp,32
>    jr      ra
> 
> After this patch:
> test:
>    lui     a5,%hi(.LANCHOR0)
>    addi    a5,a5,%lo(.LANCHOR0)
>    li      a4,32
>    addi    sp,sp,-32
>    vsetvli zero,a4,e8,m1,ta,ma
>    vle8.v  v1,0(a5)
>    vs1r.v  v1,0(a0)
>    addi    sp,sp,32
>    jr      ra
> 
> Below tests are passed within this patch:
> * The risc-v regression test.
> * The x86 bootstrap and regression test.
> * The aarch64 regression test.
> 
> 	PR target/111720
> 
> gcc/ChangeLog:
> 
> 	* dse.cc (get_stored_val): Allow vector mode if read size is
> 	less than or equal to stored size.
> 
> gcc/testsuite/ChangeLog:
> 
> 	* gcc.target/riscv/rvv/base/pr111720-0.c: New test.
> 	* gcc.target/riscv/rvv/base/pr111720-1.c: New test.
> 	* gcc.target/riscv/rvv/base/pr111720-10.c: New test.
> 	* gcc.target/riscv/rvv/base/pr111720-2.c: New test.
> 	* gcc.target/riscv/rvv/base/pr111720-3.c: New test.
> 	* gcc.target/riscv/rvv/base/pr111720-4.c: New test.
> 	* gcc.target/riscv/rvv/base/pr111720-5.c: New test.
> 	* gcc.target/riscv/rvv/base/pr111720-6.c: New test.
> 	* gcc.target/riscv/rvv/base/pr111720-7.c: New test.
> 	* gcc.target/riscv/rvv/base/pr111720-8.c: New test.
> 	* gcc.target/riscv/rvv/base/pr111720-9.c: New test.
OK for the trunk.


> 

> +  else if (VECTOR_MODE_P (read_mode) && VECTOR_MODE_P (store_mode)
> +    && known_le (GET_MODE_BITSIZE (read_mode), GET_MODE_BITSIZE (store_mode))
> +    && targetm.modes_tieable_p (read_mode, store_mode))
> +    read_reg = gen_lowpart (read_mode, copy_rtx (store_info->rhs));
>     else
>       read_reg = extract_low_bits (read_mode, store_mode,
>   				 copy_rtx (store_info->rhs));
It may not matter, especially for RV, but we could possibly have a 
mixture of scalar and vector modes in the RTL.  Say a vector store 
followed by a scalar read or vice-versa.

I wouldn't try to handle that case unless we had actual evidence it was 
useful to do so.  Just wanted to point out that unlike pseudos we can 
have multiple modes referencing the same memory location.
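
A source-level analogue of the mixed-mode situation described above, offered
only as an illustration: the same 16 bytes of memory are written as one
vector-sized block and then read back as a 32-bit scalar. The function names
and the 16-byte size are assumptions for this sketch, and it says nothing
about what RTL GCC would actually generate for such code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { VEC_BYTES = 16 };  /* assumed vector size for the illustration  */

/* Write VEC to a memory slot as one 16-byte block, then read the first
   four bytes of the same slot back as a scalar.  */
static uint32_t
vector_store_then_scalar_read (const uint8_t *vec)
{
  uint8_t slot[VEC_BYTES];
  uint32_t scalar;

  assert (vec != NULL);
  memcpy (slot, vec, VEC_BYTES);          /* the "vector mode" store  */
  memcpy (&scalar, slot, sizeof scalar);  /* the "scalar mode" read  */
  return scalar;
}

/* Fill every byte with BYTE so the result does not depend on endianness.  */
static uint32_t
scalar_read_of_uniform_vector (uint8_t byte)
{
  uint8_t vec[VEC_BYTES];

  memset (vec, byte, VEC_BYTES);
  return vector_store_then_scalar_read (vec);
}
```

Here one memory location is referenced through two different access sizes,
which is the property that distinguishes memory from pseudos in the remark
above.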

Jeff


* RE: [PATCH v4] DSE: Allow vector type for get_stored_val when read < store
  2023-11-13 20:11   ` Jeff Law
@ 2023-11-15  0:18     ` Li, Pan2
  2023-11-15  0:24       ` Li, Pan2
  0 siblings, 1 reply; 18+ messages in thread
From: Li, Pan2 @ 2023-11-15  0:18 UTC (permalink / raw)
  To: Jeff Law, gcc-patches
  Cc: juzhe.zhong, Wang, Yanzhang, kito.cheng, richard.guenther,
	richard.sandiford

> I wouldn't try to handle that case unless we had actual evidence it was 
> useful to do so.  Just wanted to point out that unlike pseudos we can 
> have multiple modes referencing the same memory location.

Got the point, thanks Jeff for emphasizing this 😉.

Pan

-----Original Message-----
From: Jeff Law <jeffreyalaw@gmail.com> 
Sent: Tuesday, November 14, 2023 4:12 AM
To: Li, Pan2 <pan2.li@intel.com>; gcc-patches@gcc.gnu.org
Cc: juzhe.zhong@rivai.ai; Wang, Yanzhang <yanzhang.wang@intel.com>; kito.cheng@gmail.com; richard.guenther@gmail.com; richard.sandiford@arm.com2
Subject: Re: [PATCH v4] DSE: Allow vector type for get_stored_val when read < store



On 11/12/23 20:22, pan2.li@intel.com wrote:
> From: Pan Li <pan2.li@intel.com>
> 
> Update in v4:
> * Merge upstream and removed some independent changes.
> 
> Update in v3:
> * Take known_le instead of known_lt for vector size.
> * Return NULL_RTX when gap is not equal 0 and not constant.
> 
> Update in v2:
> * Move vector type support to get_stored_val.
> 
> Original log:
> 
> This patch would like to allow the vector mode in the
> get_stored_val in the DSE. It is valid for the read
> rtx if and only if the read bitsize is less than the
> stored bitsize.
> 
> Given below example code with
> --param=riscv-autovec-preference=fixed-vlmax.
> 
> vuint8m1_t test () {
>    uint8_t arr[32] = {
>      1, 2, 7, 1, 3, 4, 5, 3, 1, 0, 1, 2, 4, 4, 9, 9,
>      1, 2, 7, 1, 3, 4, 5, 3, 1, 0, 1, 2, 4, 4, 9, 9,
>    };
> 
>    return __riscv_vle8_v_u8m1(arr, 32);
> }
> 
> Before this patch:
> test:
>    lui     a5,%hi(.LANCHOR0)
>    addi    sp,sp,-32
>    addi    a5,a5,%lo(.LANCHOR0)
>    li      a3,32
>    vl2re64.v       v2,0(a5)
>    vsetvli zero,a3,e8,m1,ta,ma
>    vs2r.v  v2,0(sp)             <== Unnecessary store to stack
>    vle8.v  v1,0(sp)             <== Ditto
>    vs1r.v  v1,0(a0)
>    addi    sp,sp,32
>    jr      ra
> 
> After this patch:
> test:
>    lui     a5,%hi(.LANCHOR0)
>    addi    a5,a5,%lo(.LANCHOR0)
>    li      a4,32
>    addi    sp,sp,-32
>    vsetvli zero,a4,e8,m1,ta,ma
>    vle8.v  v1,0(a5)
>    vs1r.v  v1,0(a0)
>    addi    sp,sp,32
>    jr      ra
> 
> Below tests are passed within this patch:
> * The risc-v regression test.
> * The x86 bootstrap and regression test.
> * The aarch64 regression test.
> 
> 	PR target/111720
> 
> gcc/ChangeLog:
> 
> 	* dse.cc (get_stored_val): Allow vector mode if read size is
> 	less than or equal to stored size.
> 
> gcc/testsuite/ChangeLog:
> 
> 	* gcc.target/riscv/rvv/base/pr111720-0.c: New test.
> 	* gcc.target/riscv/rvv/base/pr111720-1.c: New test.
> 	* gcc.target/riscv/rvv/base/pr111720-10.c: New test.
> 	* gcc.target/riscv/rvv/base/pr111720-2.c: New test.
> 	* gcc.target/riscv/rvv/base/pr111720-3.c: New test.
> 	* gcc.target/riscv/rvv/base/pr111720-4.c: New test.
> 	* gcc.target/riscv/rvv/base/pr111720-5.c: New test.
> 	* gcc.target/riscv/rvv/base/pr111720-6.c: New test.
> 	* gcc.target/riscv/rvv/base/pr111720-7.c: New test.
> 	* gcc.target/riscv/rvv/base/pr111720-8.c: New test.
> 	* gcc.target/riscv/rvv/base/pr111720-9.c: New test.
OK for the trunk.


> 

> +  else if (VECTOR_MODE_P (read_mode) && VECTOR_MODE_P (store_mode)
> +    && known_le (GET_MODE_BITSIZE (read_mode), GET_MODE_BITSIZE (store_mode))
> +    && targetm.modes_tieable_p (read_mode, store_mode))
> +    read_reg = gen_lowpart (read_mode, copy_rtx (store_info->rhs));
>     else
>       read_reg = extract_low_bits (read_mode, store_mode,
>   				 copy_rtx (store_info->rhs));
It may not matter, especially for RV, but we could possibly have a 
mixture of scalar and vector modes in the RTL.  Say a vector store 
followed by a scalar read or vice-versa.

I wouldn't try to handle that case unless we had actual evidence it was 
useful to do so.  Just wanted to point out that unlike pseudos we can 
have multiple modes referencing the same memory location.

Jeff


* RE: [PATCH v4] DSE: Allow vector type for get_stored_val when read < store
  2023-11-15  0:18     ` Li, Pan2
@ 2023-11-15  0:24       ` Li, Pan2
  2023-11-22  2:30         ` Li, Pan2
  0 siblings, 1 reply; 18+ messages in thread
From: Li, Pan2 @ 2023-11-15  0:24 UTC (permalink / raw)
  To: gcc-patches
  Cc: juzhe.zhong, Wang, Yanzhang, kito.cheng, richard.guenther,
	richard.sandiford, Jeff Law

Sorry for the noise. It looks like I had a typo in Richard S's email address; cc'ing the right address for awareness.

Pan

-----Original Message-----
From: Li, Pan2 
Sent: Wednesday, November 15, 2023 8:18 AM
To: Jeff Law <jeffreyalaw@gmail.com>; gcc-patches@gcc.gnu.org
Cc: juzhe.zhong@rivai.ai; Wang, Yanzhang <yanzhang.wang@intel.com>; kito.cheng@gmail.com; richard.guenther@gmail.com; richard.sandiford@arm.com2
Subject: RE: [PATCH v4] DSE: Allow vector type for get_stored_val when read < store

> I wouldn't try to handle that case unless we had actual evidence it was 
> useful to do so.  Just wanted to point out that unlike pseudos we can 
> have multiple modes referencing the same memory location.

Got the point, thanks Jeff for emphasizing this 😉.

Pan

-----Original Message-----
From: Jeff Law <jeffreyalaw@gmail.com> 
Sent: Tuesday, November 14, 2023 4:12 AM
To: Li, Pan2 <pan2.li@intel.com>; gcc-patches@gcc.gnu.org
Cc: juzhe.zhong@rivai.ai; Wang, Yanzhang <yanzhang.wang@intel.com>; kito.cheng@gmail.com; richard.guenther@gmail.com; richard.sandiford@arm.com2
Subject: Re: [PATCH v4] DSE: Allow vector type for get_stored_val when read < store



On 11/12/23 20:22, pan2.li@intel.com wrote:
> From: Pan Li <pan2.li@intel.com>
> 
> Update in v4:
> * Merge upstream and removed some independent changes.
> 
> Update in v3:
> * Take known_le instead of known_lt for vector size.
> * Return NULL_RTX when gap is not equal 0 and not constant.
> 
> Update in v2:
> * Move vector type support to get_stored_val.
> 
> Original log:
> 
> This patch would like to allow the vector mode in the
> get_stored_val in the DSE. It is valid for the read
> rtx if and only if the read bitsize is less than the
> stored bitsize.
> 
> Given below example code with
> --param=riscv-autovec-preference=fixed-vlmax.
> 
> vuint8m1_t test () {
>    uint8_t arr[32] = {
>      1, 2, 7, 1, 3, 4, 5, 3, 1, 0, 1, 2, 4, 4, 9, 9,
>      1, 2, 7, 1, 3, 4, 5, 3, 1, 0, 1, 2, 4, 4, 9, 9,
>    };
> 
>    return __riscv_vle8_v_u8m1(arr, 32);
> }
> 
> Before this patch:
> test:
>    lui     a5,%hi(.LANCHOR0)
>    addi    sp,sp,-32
>    addi    a5,a5,%lo(.LANCHOR0)
>    li      a3,32
>    vl2re64.v       v2,0(a5)
>    vsetvli zero,a3,e8,m1,ta,ma
>    vs2r.v  v2,0(sp)             <== Unnecessary store to stack
>    vle8.v  v1,0(sp)             <== Ditto
>    vs1r.v  v1,0(a0)
>    addi    sp,sp,32
>    jr      ra
> 
> After this patch:
> test:
>    lui     a5,%hi(.LANCHOR0)
>    addi    a5,a5,%lo(.LANCHOR0)
>    li      a4,32
>    addi    sp,sp,-32
>    vsetvli zero,a4,e8,m1,ta,ma
>    vle8.v  v1,0(a5)
>    vs1r.v  v1,0(a0)
>    addi    sp,sp,32
>    jr      ra
> 
> The tests below pass with this patch:
> * The risc-v regression test.
> * The x86 bootstrap and regression test.
> * The aarch64 regression test.
> 
> 	PR target/111720
> 
> gcc/ChangeLog:
> 
> 	* dse.cc (get_stored_val): Allow vector mode if read size is
> 	less than or equal to stored size.
> 
> gcc/testsuite/ChangeLog:
> 
> 	* gcc.target/riscv/rvv/base/pr111720-0.c: New test.
> 	* gcc.target/riscv/rvv/base/pr111720-1.c: New test.
> 	* gcc.target/riscv/rvv/base/pr111720-10.c: New test.
> 	* gcc.target/riscv/rvv/base/pr111720-2.c: New test.
> 	* gcc.target/riscv/rvv/base/pr111720-3.c: New test.
> 	* gcc.target/riscv/rvv/base/pr111720-4.c: New test.
> 	* gcc.target/riscv/rvv/base/pr111720-5.c: New test.
> 	* gcc.target/riscv/rvv/base/pr111720-6.c: New test.
> 	* gcc.target/riscv/rvv/base/pr111720-7.c: New test.
> 	* gcc.target/riscv/rvv/base/pr111720-8.c: New test.
> 	* gcc.target/riscv/rvv/base/pr111720-9.c: New test.
OK for the trunk.


> 

> +  else if (VECTOR_MODE_P (read_mode) && VECTOR_MODE_P (store_mode)
> +    && known_le (GET_MODE_BITSIZE (read_mode), GET_MODE_BITSIZE (store_mode))
> +    && targetm.modes_tieable_p (read_mode, store_mode))
> +    read_reg = gen_lowpart (read_mode, copy_rtx (store_info->rhs));
>     else
>       read_reg = extract_low_bits (read_mode, store_mode,
>   				 copy_rtx (store_info->rhs));
It may not matter, especially for RV, but we could possibly have a 
mixture of scalar and vector modes in the RTL.  Say a vector store 
followed by a scalar read or vice-versa.

I wouldn't try to handle that case unless we had actual evidence it was 
useful to do so.  Just wanted to point out that unlike pseudos we can 
have multiple modes referencing the same memory location.
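
To make the guard above concrete, here is a minimal sketch in plain C. It is a toy model, not GCC code: struct toy_mode, toy_known_le, toy_pick_path and toy_self_check are hypothetical names, the tieable flag stands in for whatever targetm.modes_tieable_p would return on a given target, and sizes are modelled as base + step * x for an unknown runtime factor x >= 0. It covers only the two branches quoted above and assumes nothing about the rest of get_stored_val.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a machine mode; this is not GCC's machine_mode.  */
struct toy_mode {
  bool is_vector;      /* models VECTOR_MODE_P (mode) */
  unsigned base_bits;  /* constant part of the size in bits */
  unsigned step_bits;  /* part scaled by the unknown runtime factor x >= 0 */
};

enum toy_path { TOY_GEN_LOWPART, TOY_EXTRACT_LOW_BITS };

/* Models known_le for sizes of the form base + step * x: the comparison
   holds for every x >= 0 exactly when both coefficients compare <=.  */
static bool toy_known_le(struct toy_mode a, struct toy_mode b) {
  return a.base_bits <= b.base_bits && a.step_bits <= b.step_bits;
}

/* Mirrors the shape of the quoted condition: the lowpart path is taken only
   when both modes are vector modes, the read is known to be no wider than
   the store, and the (assumed) target answer says the modes are tieable.
   Anything else, including a scalar/vector mix, uses the existing path.  */
static enum toy_path toy_pick_path(struct toy_mode read_mode,
                                   struct toy_mode store_mode,
                                   bool tieable) {
  if (read_mode.is_vector && store_mode.is_vector
      && toy_known_le(read_mode, store_mode)
      && tieable)
    return TOY_GEN_LOWPART;
  return TOY_EXTRACT_LOW_BITS;
}

/* A few illustrative cases; the concrete sizes are made up.  */
static void toy_self_check(void) {
  struct toy_mode vec_small = { true, 0, 64 };
  struct toy_mode vec_big = { true, 0, 128 };
  struct toy_mode scalar64 = { false, 64, 0 };

  /* Narrower vector read of a wider vector store.  */
  assert(toy_pick_path(vec_small, vec_big, true) == TOY_GEN_LOWPART);
  /* Equal sizes also qualify, which is the known_le (not known_lt) change.  */
  assert(toy_pick_path(vec_small, vec_small, true) == TOY_GEN_LOWPART);
  /* Wider read than store: not known to fit.  */
  assert(toy_pick_path(vec_big, vec_small, true) == TOY_EXTRACT_LOW_BITS);
  /* Scalar read of a vector store: left to the existing path.  */
  assert(toy_pick_path(scalar64, vec_big, true) == TOY_EXTRACT_LOW_BITS);
  /* Target says the modes are not tieable.  */
  assert(toy_pick_path(vec_small, vec_big, false) == TOY_EXTRACT_LOW_BITS);
}
```

Calling toy_self_check() from a test driver exercises each case. The scalar/vector mix is the situation described above: in this model it simply falls through to the existing extract_low_bits path rather than getting special handling.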

Jeff

^ permalink raw reply	[flat|nested] 18+ messages in thread

* RE: [PATCH v4] DSE: Allow vector type for get_stored_val when read < store
  2023-11-15  0:24       ` Li, Pan2
@ 2023-11-22  2:30         ` Li, Pan2
  2023-11-22  8:02           ` Richard Biener
  0 siblings, 1 reply; 18+ messages in thread
From: Li, Pan2 @ 2023-11-22  2:30 UTC (permalink / raw)
  To: richard.sandiford
  Cc: juzhe.zhong, Wang, Yanzhang, kito.cheng, richard.guenther,
	richard.sandiford, Jeff Law, gcc-patches

Hi Richard S,

Thanks a lot for the review and comments. Are there any concerns or further comments before landing this patch in GCC 14?

Pan

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v4] DSE: Allow vector type for get_stored_val when read < store
  2023-11-22  2:30         ` Li, Pan2
@ 2023-11-22  8:02           ` Richard Biener
  2023-11-22 11:38             ` Li, Pan2
  0 siblings, 1 reply; 18+ messages in thread
From: Richard Biener @ 2023-11-22  8:02 UTC (permalink / raw)
  To: Li, Pan2
  Cc: richard.sandiford, juzhe.zhong, Wang, Yanzhang, kito.cheng,
	Jeff Law, gcc-patches

On Wed, Nov 22, 2023 at 3:30 AM Li, Pan2 <pan2.li@intel.com> wrote:
>
> Hi Richard S,
>
> Thanks a lot for the review and comments. Are there any concerns or further comments before landing this patch in GCC 14?

It looks like Jeff approved the patch?

Richard.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* RE: [PATCH v4] DSE: Allow vector type for get_stored_val when read < store
  2023-11-22  8:02           ` Richard Biener
@ 2023-11-22 11:38             ` Li, Pan2
  2023-11-22 18:39               ` Richard Sandiford
  0 siblings, 1 reply; 18+ messages in thread
From: Li, Pan2 @ 2023-11-22 11:38 UTC (permalink / raw)
  To: Richard Biener
  Cc: richard.sandiford, juzhe.zhong, Wang, Yanzhang, kito.cheng,
	Jeff Law, gcc-patches

> It looks like Jeff approved the patch?

Yes, I just wanted to double-check that the approach in this patch matches what Richard S suggested.

Pan

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v4] DSE: Allow vector type for get_stored_val when read < store
  2023-11-22 11:38             ` Li, Pan2
@ 2023-11-22 18:39               ` Richard Sandiford
  2023-11-23  1:20                 ` Li, Pan2
  0 siblings, 1 reply; 18+ messages in thread
From: Richard Sandiford @ 2023-11-22 18:39 UTC (permalink / raw)
  To: Li, Pan2
  Cc: Richard Biener, juzhe.zhong, Wang, Yanzhang, kito.cheng,
	Jeff Law, gcc-patches

"Li, Pan2" <pan2.li@intel.com> writes:
>> It looks like Jeff approved the patch?
>
> Yes, I just wanted to double-check that the approach in this patch matches what Richard S suggested.

Yeah, it looks good to me, thanks.

Richard

^ permalink raw reply	[flat|nested] 18+ messages in thread

* RE: [PATCH v4] DSE: Allow vector type for get_stored_val when read < store
  2023-11-22 18:39               ` Richard Sandiford
@ 2023-11-23  1:20                 ` Li, Pan2
  0 siblings, 0 replies; 18+ messages in thread
From: Li, Pan2 @ 2023-11-23  1:20 UTC (permalink / raw)
  To: Richard Sandiford
  Cc: Richard Biener, juzhe.zhong, Wang, Yanzhang, kito.cheng,
	Jeff Law, gcc-patches

Committed, thanks all.

Pan

-----Original Message-----
From: Richard Sandiford <richard.sandiford@arm.com> 
Sent: Thursday, November 23, 2023 2:39 AM
To: Li, Pan2 <pan2.li@intel.com>
Cc: Richard Biener <richard.guenther@gmail.com>; juzhe.zhong@rivai.ai; Wang, Yanzhang <yanzhang.wang@intel.com>; kito.cheng@gmail.com; Jeff Law <jeffreyalaw@gmail.com>; gcc-patches@gcc.gnu.org
Subject: Re: [PATCH v4] DSE: Allow vector type for get_stored_val when read < store

"Li, Pan2" <pan2.li@intel.com> writes:
>> It looks like Jeff approved the patch?
>
> Yes, just would like to double check the way of this patch is expected as following the suggestion of Richard S.

Yeah, it looks good to me, thanks.

Richard

> Pan
>
> -----Original Message-----
> From: Richard Biener <richard.guenther@gmail.com> 
> Sent: Wednesday, November 22, 2023 4:02 PM
> To: Li, Pan2 <pan2.li@intel.com>
> Cc: richard.sandiford@arm.com; juzhe.zhong@rivai.ai; Wang, Yanzhang <yanzhang.wang@intel.com>; kito.cheng@gmail.com; Jeff Law <jeffreyalaw@gmail.com>; gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH v4] DSE: Allow vector type for get_stored_val when read < store
>
> On Wed, Nov 22, 2023 at 3:30 AM Li, Pan2 <pan2.li@intel.com> wrote:
>>
>> Hi Richard S,
>>
>> Thanks a lot for reviewing and comments. May I know is there any concern or further comments for landing this patch to GCC-14?
>
> It looks like Jeff approved the patch?
>
> Richard.
>
>> Pan
>>
>> -----Original Message-----
>> From: Li, Pan2
>> Sent: Wednesday, November 15, 2023 8:25 AM
>> To: gcc-patches@gcc.gnu.org
>> Cc: juzhe.zhong@rivai.ai; Wang, Yanzhang <yanzhang.wang@intel.com>; kito.cheng@gmail.com; richard.guenther@gmail.com; richard.sandiford@arm.com; Jeff Law <jeffreyalaw@gmail.com>
>> Subject: RE: [PATCH v4] DSE: Allow vector type for get_stored_val when read < store
>>
>> Sorry for disturbing, looks I have a typo for Richard S's email address, cc the right email address for awareness.
>>
>> Pan
>>
>> -----Original Message-----
>> From: Li, Pan2
>> Sent: Wednesday, November 15, 2023 8:18 AM
>> To: Jeff Law <jeffreyalaw@gmail.com>; gcc-patches@gcc.gnu.org
>> Cc: juzhe.zhong@rivai.ai; Wang, Yanzhang <yanzhang.wang@intel.com>; kito.cheng@gmail.com; richard.guenther@gmail.com; richard.sandiford@arm.com2
>> Subject: RE: [PATCH v4] DSE: Allow vector type for get_stored_val when read < store
>>
>> > I wouldn't try to handle that case unless we had actual evidence it was
>> > useful to do so.  Just wanted to point out that unlike pseudos we can
>> > have multiple modes referencing the same memory location.
>>
>> Got the point here, thanks Jeff for emphasizing this, 😉.
>>
>> Pan
>>
>> -----Original Message-----
>> From: Jeff Law <jeffreyalaw@gmail.com>
>> Sent: Tuesday, November 14, 2023 4:12 AM
>> To: Li, Pan2 <pan2.li@intel.com>; gcc-patches@gcc.gnu.org
>> Cc: juzhe.zhong@rivai.ai; Wang, Yanzhang <yanzhang.wang@intel.com>; kito.cheng@gmail.com; richard.guenther@gmail.com; richard.sandiford@arm.com2
>> Subject: Re: [PATCH v4] DSE: Allow vector type for get_stored_val when read < store
>>
>>
>>
>> On 11/12/23 20:22, pan2.li@intel.com wrote:
>> > From: Pan Li <pan2.li@intel.com>
>> >
>> > Update in v4:
>> > * Merge upstream and removed some independent changes.
>> >
>> > Update in v3:
>> > * Take known_le instead of known_lt for vector size.
>> > * Return NULL_RTX when gap is not equal 0 and not constant.
>> >
>> > Update in v2:
>> > * Move vector type support to get_stored_val.
>> >
>> > Original log:
>> >
>> > This patch would like to allow the vector mode in the
>> > get_stored_val in the DSE. It is valid for the read
>> > rtx if and only if the read bitsize is less than the
>> > stored bitsize.
>> >
>> > Given below example code with
>> > --param=riscv-autovec-preference=fixed-vlmax.
>> >
>> > vuint8m1_t test () {
>> >    uint8_t arr[32] = {
>> >      1, 2, 7, 1, 3, 4, 5, 3, 1, 0, 1, 2, 4, 4, 9, 9,
>> >      1, 2, 7, 1, 3, 4, 5, 3, 1, 0, 1, 2, 4, 4, 9, 9,
>> >    };
>> >
>> >    return __riscv_vle8_v_u8m1(arr, 32);
>> > }
>> >
>> > Before this patch:
>> > test:
>> >    lui     a5,%hi(.LANCHOR0)
>> >    addi    sp,sp,-32
>> >    addi    a5,a5,%lo(.LANCHOR0)
>> >    li      a3,32
>> >    vl2re64.v       v2,0(a5)
>> >    vsetvli zero,a3,e8,m1,ta,ma
>> >    vs2r.v  v2,0(sp)             <== Unnecessary store to stack
>> >    vle8.v  v1,0(sp)             <== Ditto
>> >    vs1r.v  v1,0(a0)
>> >    addi    sp,sp,32
>> >    jr      ra
>> >
>> > After this patch:
>> > test:
>> >    lui     a5,%hi(.LANCHOR0)
>> >    addi    a5,a5,%lo(.LANCHOR0)
>> >    li      a4,32
>> >    addi    sp,sp,-32
>> >    vsetvli zero,a4,e8,m1,ta,ma
>> >    vle8.v  v1,0(a5)
>> >    vs1r.v  v1,0(a0)
>> >    addi    sp,sp,32
>> >    jr      ra
>> >
>> > The following tests pass with this patch:
>> > * The risc-v regression test.
>> > * The x86 bootstrap and regression test.
>> > * The aarch64 regression test.
>> >
>> >       PR target/111720
>> >
>> > gcc/ChangeLog:
>> >
>> >       * dse.cc (get_stored_val): Allow vector mode if read size is
>> >       less than or equal to stored size.
>> >
>> > gcc/testsuite/ChangeLog:
>> >
>> >       * gcc.target/riscv/rvv/base/pr111720-0.c: New test.
>> >       * gcc.target/riscv/rvv/base/pr111720-1.c: New test.
>> >       * gcc.target/riscv/rvv/base/pr111720-10.c: New test.
>> >       * gcc.target/riscv/rvv/base/pr111720-2.c: New test.
>> >       * gcc.target/riscv/rvv/base/pr111720-3.c: New test.
>> >       * gcc.target/riscv/rvv/base/pr111720-4.c: New test.
>> >       * gcc.target/riscv/rvv/base/pr111720-5.c: New test.
>> >       * gcc.target/riscv/rvv/base/pr111720-6.c: New test.
>> >       * gcc.target/riscv/rvv/base/pr111720-7.c: New test.
>> >       * gcc.target/riscv/rvv/base/pr111720-8.c: New test.
>> >       * gcc.target/riscv/rvv/base/pr111720-9.c: New test.
>> OK for the trunk.
>>
>>
>> >
>>
>> > +  else if (VECTOR_MODE_P (read_mode) && VECTOR_MODE_P (store_mode)
>> > +    && known_le (GET_MODE_BITSIZE (read_mode), GET_MODE_BITSIZE (store_mode))
>> > +    && targetm.modes_tieable_p (read_mode, store_mode))
>> > +    read_reg = gen_lowpart (read_mode, copy_rtx (store_info->rhs));
>> >     else
>> >       read_reg = extract_low_bits (read_mode, store_mode,
>> >                                copy_rtx (store_info->rhs));
>> It may not matter, especially for RV, but we could possibly have a
>> mixture of scalar and vector modes in the RTL.  Say a vector store
>> followed by a scalar read or vice-versa.
>>
>> I wouldn't try to handle that case unless we had actual evidence it was
>> useful to do so.  Just wanted to point out that unlike pseudos we can
>> have multiple modes referencing the same memory location.
>>
>> Jeff
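
The branch selection in the quoted hunk, together with Jeff's remark about mixed scalar and vector accesses, can be sketched as a small standalone model. This is only an illustration under stated assumptions: PolySize, ModeModel, known_le_model and choose_path are hypothetical names, not GCC's API, and the model covers only the last two branches shown in the hunk. The two-coefficient comparison mirrors how a "known" less-or-equal has to hold for every runtime vector length, which is why a scalable read is not forwarded from a fixed-size store unless both coefficients compare favourably.

```cpp
// Simplified model of the final branches in the quoted get_stored_val hunk.
// All names are hypothetical stand-ins; this is not GCC code.
#include <cassert>

// A size of the form c0 + c1 * x bits, where x >= 0 is only known at run time
// (a two-coefficient stand-in for a polynomial mode size).
struct PolySize {
  unsigned long c0;
  unsigned long c1;
};

// True only if a <= b holds for every x >= 0.
constexpr bool known_le_model(PolySize a, PolySize b) {
  return a.c0 <= b.c0 && a.c1 <= b.c1;
}

struct ModeModel {
  bool is_vector;
  PolySize bits;
};

enum class ForwardPath { Lowpart, ExtractLowBits };

// Vector read from a vector store that is known to be at least as wide, with
// tieable modes, takes the lowpart path; everything else falls through to the
// extract_low_bits path, as in the hunk.
constexpr ForwardPath choose_path(ModeModel read, ModeModel store,
                                  bool tieable) {
  return (read.is_vector && store.is_vector
          && known_le_model(read.bits, store.bits) && tieable)
             ? ForwardPath::Lowpart
             : ForwardPath::ExtractLowBits;
}

inline void self_check() {
  const ModeModel v128{true, {128, 0}}, v256{true, {256, 0}};
  const ModeModel scalable{true, {64, 64}}, s64{false, {64, 0}};
  // Narrower vector read from a wider vector store.
  assert(choose_path(v128, v256, true) == ForwardPath::Lowpart);
  // Equal sizes qualify because the test is <=, not <.
  assert(choose_path(v256, v256, true) == ForwardPath::Lowpart);
  // 128 <= 64 + 64 * x is false for x == 0, so it is not "known".
  assert(choose_path(v128, scalable, true) == ForwardPath::ExtractLowBits);
  // Mixed scalar/vector accesses are left to the existing path.
  assert(choose_path(s64, v256, true) == ForwardPath::ExtractLowBits);
  // Non-tieable modes are left to the existing path too.
  assert(choose_path(v128, v256, false) == ForwardPath::ExtractLowBits);
}
```

Calling self_check() exercises the five cases discussed in the thread: narrower read, equal sizes, a size relation that is not known for all vector lengths, a scalar read of a vector store, and non-tieable modes.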


end of thread, other threads:[~2023-11-23  1:20 UTC | newest]

Thread overview: 18+ messages
2023-11-02  3:14 [PATCH v1] EXPMED: Allow vector mode for DSE extract_low_bits [PR111720] pan2.li
2023-11-02  8:19 ` Richard Biener
2023-11-02 12:17   ` Li, Pan2
2023-11-09  6:08 ` [PATCH v2] DSE: Allow vector type for get_stored_val when read < store pan2.li
2023-11-09 16:16   ` Jeff Law
2023-11-11 15:23     ` Richard Sandiford
2023-11-12  2:30       ` Li, Pan2
2023-11-13  3:25         ` Li, Pan2
2023-11-12 12:27 ` [PATCH v3] " pan2.li
2023-11-13  3:22 ` [PATCH v4] " pan2.li
2023-11-13 20:11   ` Jeff Law
2023-11-15  0:18     ` Li, Pan2
2023-11-15  0:24       ` Li, Pan2
2023-11-22  2:30         ` Li, Pan2
2023-11-22  8:02           ` Richard Biener
2023-11-22 11:38             ` Li, Pan2
2023-11-22 18:39               ` Richard Sandiford
2023-11-23  1:20                 ` Li, Pan2
