public inbox for gcc-patches@gcc.gnu.org
* [PATCH V3] RISC-V: Fix bug of pre-calculated const vector mask for VNx1BI, VNx2BI and VNx4BI
@ 2023-06-28  9:47 Juzhe-Zhong
  2023-06-28 18:11 ` Jeff Law
  0 siblings, 1 reply; 21+ messages in thread
From: Juzhe-Zhong @ 2023-06-28  9:47 UTC (permalink / raw)
  To: gcc-patches
  Cc: kito.cheng, kito.cheng, palmer, palmer, jeffreyalaw, rdapp.gcc,
	Juzhe-Zhong

This bug blocks the following patches.

GCC doesn't know that RVV uses a compact mask model.
Consider the following case:

#define N 16

int
main ()
{
  int8_t mask[N] = {0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1};
  int8_t out[N] = {0};
  for (int8_t i = 0; i < N; ++i)
    if (mask[i])
      out[i] = i;
  for (int8_t i = 0; i < N; ++i)
    {
      if (mask[i])
	assert (out[i] == i);
      else
	assert (out[i] == 0);
    }
}

Before this patch, the pre-calculated mask in the constant pool is:
.LC1:
        .byte   68 ====> 0b01000100

This is incorrect, and the test case fails at run time.

After this patch:
.LC1:
	.byte	10 ====> 0b1010

The test case now passes at run time.
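
The two constants above can be reproduced with a small model of the packing.
This is only a sketch, not GCC's code: pack_mask_byte and its elt_bits
parameter are hypothetical names, and it assumes each element is stored as a
single 0/1 bit at the low end of a field that is elt_bits wide, which is
consistent with the two bytes shown for a 4-element mask {0, 1, 0, 1}.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model, not GCC code: pack NELTS mask elements into one
   byte.  Each element gets a field of ELT_BITS bits, and its truth value
   is stored in the lowest bit of that field.  */
uint8_t
pack_mask_byte (const uint8_t *mask, unsigned nelts, unsigned elt_bits)
{
  /* Everything must fit into a single byte in this toy model.  */
  assert (elt_bits > 0 && nelts * elt_bits <= 8);

  uint8_t byte = 0;
  for (unsigned i = 0; i < nelts; i++)
    if (mask[i])
      byte |= (uint8_t) (1u << (i * elt_bits));
  return byte;
}
```

With mask = {0, 1, 0, 1}, pack_mask_byte (mask, 4, 2) gives 68 (0b01000100),
the padded layout, while pack_mask_byte (mask, 4, 1) gives 10 (0b1010), the
compact layout.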

After digging into this issue, I found that the bug only happens for VNx1BI, VNx2BI and VNx4BI.
The reason is explained in the following comment:
/* Return true if the BITSIZE and PRECISION of MODE are not equal.

   This helper function compares BITSIZE and PRECISION for RVV mask modes.

   VNx1BI, VNx2BI and VNx4BI have the same BYTESIZE as VNx8BI, so the
   compiler cannot tell them apart by size alone, which would cause
   incorrect DCE/DSE for them.

   To differentiate VNx1BI/VNx2BI/VNx4BI/VNx8BI, we use ADJUST_PRECISION
   in riscv-modes.def to give each of them a different PRECISION.

   This lets the compiler differentiate them, but it causes an incorrect
   bitmask memory layout.

     E.g. for mask = { 0, -1 } in VNx2BI, the PRECISION makes the compiler
     lay out the bitmask as 0b0001, which is incorrect for RVV.

     Instead, we want the correct bitmask memory layout: 0b01.
     In this situation, we let the RISC-V backend reorganize the bitmask
     memory layout in the "mov<mode>" pattern.
*/

So here we add a helper function, "bitsize_precision_unequal_p", to force the RISC-V backend to reorganize
the bitmask memory layout of VNx1BI, VNx2BI and VNx4BI, since their PRECISION != BITSIZE.
I don't use mode == VNx1BI || mode == VNx2BI || mode == VNx4BI because we are going to have VLS modes;
maybe_ne (GET_MODE_BITSIZE (mode), GET_MODE_PRECISION (mode)) covers every case, including VLA and VLS.
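
The difference between the two checks can be sketched with a toy table of
mask modes.  Everything here is an assumption made for illustration: the
struct, the function name and the numbers are hypothetical (real mode sizes
are poly_ints, and the real test uses maybe_ne); the values are only chosen
to match the description that the four modes share one size but differ in
PRECISION.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a mask mode; not a GCC data structure.  */
struct toy_mask_mode
{
  const char *name;
  unsigned bitsize;   /* storage size in bits (illustrative value) */
  unsigned precision; /* meaningful bits (illustrative value) */
};

/* Constant-size analogue of
   maybe_ne (GET_MODE_BITSIZE (mode), GET_MODE_PRECISION (mode)).  */
bool
toy_bitsize_precision_unequal_p (const struct toy_mask_mode *mode)
{
  assert (mode->precision <= mode->bitsize);
  return mode->bitsize != mode->precision;
}

/* Illustrative values: same storage size, different precision.  */
const struct toy_mask_mode toy_modes[] = {
  {"VNx1BI", 8, 1},
  {"VNx2BI", 8, 2},
  {"VNx4BI", 8, 4},
  {"VNx8BI", 8, 8},
};
```

The predicate is true for the first three entries and false for VNx8BI
without naming any mode, so a mode added later with the same property would
be caught by the same test.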

gcc/ChangeLog:

        * config/riscv/riscv-v.cc (rvv_builder::get_compact_mask): New function.
        (expand_const_vector): Handle non-duplicate constant mask vectors
        by loading the compact mask from the constant pool.
        * config/riscv/riscv.cc (bitsize_precision_unequal_p): New function.
        (riscv_const_insns): Return a cost of 1 for constant mask vectors
        whose BITSIZE and PRECISION differ.

gcc/testsuite/ChangeLog:

        * gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-1.c: New test.
        * gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-10.c: New test.
        * gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-11.c: New test.
        * gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-12.c: New test.
        * gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-13.c: New test.
        * gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-14.c: New test.
        * gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-2.c: New test.
        * gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-3.c: New test.
        * gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-4.c: New test.
        * gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-5.c: New test.
        * gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-6.c: New test.
        * gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-7.c: New test.
        * gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-8.c: New test.
        * gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-9.c: New test.

---
 gcc/config/riscv/riscv-v.cc                   | 64 +++++++++++++++++--
 gcc/config/riscv/riscv.cc                     | 36 +++++++++++
 .../riscv/rvv/autovec/vls-vlmax/bitmask-1.c   | 23 +++++++
 .../riscv/rvv/autovec/vls-vlmax/bitmask-10.c  | 22 +++++++
 .../riscv/rvv/autovec/vls-vlmax/bitmask-11.c  | 23 +++++++
 .../riscv/rvv/autovec/vls-vlmax/bitmask-12.c  | 23 +++++++
 .../riscv/rvv/autovec/vls-vlmax/bitmask-13.c  | 23 +++++++
 .../riscv/rvv/autovec/vls-vlmax/bitmask-14.c  | 24 +++++++
 .../riscv/rvv/autovec/vls-vlmax/bitmask-2.c   | 23 +++++++
 .../riscv/rvv/autovec/vls-vlmax/bitmask-3.c   | 23 +++++++
 .../riscv/rvv/autovec/vls-vlmax/bitmask-4.c   | 23 +++++++
 .../riscv/rvv/autovec/vls-vlmax/bitmask-5.c   | 25 ++++++++
 .../riscv/rvv/autovec/vls-vlmax/bitmask-6.c   | 27 ++++++++
 .../riscv/rvv/autovec/vls-vlmax/bitmask-7.c   | 30 +++++++++
 .../riscv/rvv/autovec/vls-vlmax/bitmask-8.c   | 30 +++++++++
 .../riscv/rvv/autovec/vls-vlmax/bitmask-9.c   | 30 +++++++++
 16 files changed, 444 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-10.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-11.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-12.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-13.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-14.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-3.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-4.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-5.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-6.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-7.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-8.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-9.c

diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index adb8d7d36a5..5da0dc5e998 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -291,6 +291,7 @@ public:
 
   bool single_step_npatterns_p () const;
   bool npatterns_all_equal_p () const;
+  rtx get_compact_mask () const;
 
   machine_mode new_mode () const { return m_new_mode; }
   scalar_mode inner_mode () const { return m_inner_mode; }
@@ -505,6 +506,47 @@ rvv_builder::npatterns_all_equal_p () const
   return true;
 }
 
+/* Generate the compact mask.
+
+     E.g: mask = { 0, -1 }, mode = VNx2BI, bitsize = 128bits.
+
+	  By default GCC will generate the mask = 0b00000001xxxxx.
+
+	  However, this is not the mask RVV expects, since RVV
+	  uses the compact mask = 0b10xxxxx.
+*/
+rtx
+rvv_builder::get_compact_mask () const
+{
+  /* If TARGET_MIN_VLEN == 32, the minimum LMUL = 1/4.
+     Otherwise, the minimum LMUL = 1/8.  */
+  unsigned min_lmul = TARGET_MIN_VLEN == 32 ? 4 : 8;
+  unsigned min_container_size
+    = BYTES_PER_RISCV_VECTOR.to_constant () / min_lmul;
+  unsigned container_size = MAX (CEIL (npatterns (), 8), min_container_size);
+  machine_mode container_mode
+    = get_vector_mode (QImode, container_size).require ();
+
+  unsigned nunits = GET_MODE_NUNITS (container_mode).to_constant ();
+  rtvec v = rtvec_alloc (nunits);
+  for (unsigned i = 0; i < nunits; i++)
+    RTVEC_ELT (v, i) = const0_rtx;
+
+  unsigned char b = 0;
+  for (unsigned i = 0; i < npatterns (); i++)
+    {
+      if (INTVAL (elt (i)))
+	b = b | (1 << (i % 8));
+
+      if ((i > 0 && (i % 8) == 7) || (i == (npatterns () - 1)))
+	{
+	  RTVEC_ELT (v, ((i + 7) / 8) - 1) = gen_int_mode (b, QImode);
+	  b = 0;
+	}
+    }
+  return gen_rtx_CONST_VECTOR (container_mode, v);
+}
+
 static unsigned
 get_sew (machine_mode mode)
 {
@@ -1141,11 +1183,23 @@ expand_const_vector (rtx target, rtx src)
   if (GET_MODE_CLASS (mode) == MODE_VECTOR_BOOL)
     {
       rtx elt;
-      gcc_assert (
-	const_vec_duplicate_p (src, &elt)
-	&& (rtx_equal_p (elt, const0_rtx) || rtx_equal_p (elt, const1_rtx)));
-      rtx ops[] = {target, src};
-      emit_vlmax_insn (code_for_pred_mov (mode), RVV_UNOP, ops);
+      unsigned int nelts;
+      if (const_vec_duplicate_p (src, &elt))
+	{
+	  rtx ops[] = {target, src};
+	  emit_vlmax_insn (code_for_pred_mov (mode), RVV_UNOP, ops);
+	}
+      else if (GET_MODE_NUNITS (mode).is_constant (&nelts))
+	{
+	  rvv_builder builder (mode, nelts, 1);
+	  for (unsigned int i = 0; i < nelts; i++)
+	    builder.quick_push (CONST_VECTOR_ELT (src, i));
+	  rtx mask = builder.get_compact_mask ();
+	  rtx mem = validize_mem (force_const_mem (GET_MODE (mask), mask));
+	  emit_move_insn (target, gen_rtx_MEM (mode, XEXP (mem, 0)));
+	}
+      else
+	gcc_unreachable ();
       return;
     }
 
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 280aa0b33b9..13a9f98f30a 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -1244,6 +1244,36 @@ riscv_address_insns (rtx x, machine_mode mode, bool might_split_p)
   return n;
 }
 
+/* Return true if the BITSIZE and PRECISION of MODE are not equal.
+
+   This helper function compares BITSIZE and PRECISION for RVV mask modes.
+
+   VNx1BI, VNx2BI and VNx4BI have the same BYTESIZE as VNx8BI, so the
+   compiler cannot tell them apart by size alone, which would cause
+   incorrect DCE/DSE for them.
+
+   To differentiate VNx1BI/VNx2BI/VNx4BI/VNx8BI, we use ADJUST_PRECISION
+   in riscv-modes.def to give each of them a different PRECISION.
+
+   This lets the compiler differentiate them, but it causes an incorrect
+   bitmask memory layout.
+
+     E.g. for mask = { 0, -1 } in VNx2BI, the PRECISION makes the compiler
+     lay out the bitmask as 0b0001, which is incorrect for RVV.
+
+     Instead, we want the correct bitmask memory layout: 0b01.
+     In this situation, we let the RISC-V backend reorganize the bitmask
+     memory layout in the "mov<mode>" pattern.
+*/
+static bool
+bitsize_precision_unequal_p (machine_mode mode)
+{
+  /* We don't need to worry about non-BOOL vector modes for RVV.  */
+  if (GET_MODE_CLASS (mode) != MODE_VECTOR_BOOL)
+    return false;
+  return maybe_ne (GET_MODE_BITSIZE (mode), GET_MODE_PRECISION (mode));
+}
+
 /* Return the number of instructions needed to load constant X.
    Return 0 if X isn't a valid constant.  */
 
@@ -1323,6 +1353,12 @@ riscv_const_insns (rtx x)
 		      return 1 + 4; /*vmv.v.x + memory access.  */
 		  }
 	      }
+
+	    /* GCC doesn't know that RVV uses the compact mask model,
+	       so by default we handle masks in the mov<mode> pattern.  */
+	    if (bitsize_precision_unequal_p (GET_MODE (x)))
+	      /* TODO: Adjust this to the real cost model of vlm.v.  */
+	      return 1;
 	  }
 
 	/* TODO: We may support more const vector in the future.  */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-1.c
new file mode 100644
index 00000000000..81229fd62b9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-1.c
@@ -0,0 +1,23 @@
+/* { dg-do run { target { riscv_vector } } } */
+/* { dg-options "--param riscv-autovec-preference=fixed-vlmax -O3" } */
+
+#include <stdint-gcc.h>
+#include <assert.h>
+#define N 16
+
+int
+main ()
+{
+  int mask[N] = {0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1};
+  int64_t out[N] = {0};
+  for (int i = 0; i < N; ++i)
+    if (mask[i])
+      out[i] = i;
+  for (int i = 0; i < N; ++i)
+    {
+      if (mask[i])
+	assert (out[i] == i);
+      else
+	assert (out[i] == 0);
+    }
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-10.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-10.c
new file mode 100644
index 00000000000..d891f3c16e9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-10.c
@@ -0,0 +1,22 @@
+/* { dg-do run { target { riscv_vector } } } */
+/* { dg-options "--param riscv-autovec-preference=fixed-vlmax -O3 --param riscv-autovec-lmul=m2" } */
+#include <stdint-gcc.h>
+#include <assert.h>
+#define N 16
+
+int
+main ()
+{
+  int mask[N] = {0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1};
+  int8_t out[N] = {0};
+  for (int i = 0; i < N; ++i)
+    if (mask[i])
+      out[i] = i;
+  for (int i = 0; i < N; ++i)
+    {
+      if (mask[i])
+	assert (out[i] == i);
+      else
+	assert (out[i] == 0);
+    }
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-11.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-11.c
new file mode 100644
index 00000000000..535641443ec
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-11.c
@@ -0,0 +1,23 @@
+/* { dg-do run { target { riscv_vector } } } */
+/* { dg-options "--param riscv-autovec-preference=fixed-vlmax -O3" } */
+
+#include <stdint-gcc.h>
+#include <assert.h>
+#define N 4
+
+int
+main ()
+{
+  int mask[N] = {0, 1, 0, 1};
+  int out[N] = {0};
+  for (int i = 0; i < N; ++i)
+    if (mask[i])
+      out[i] = i;
+  for (int i = 0; i < N; ++i)
+    {
+      if (mask[i])
+	assert (out[i] == i);
+      else
+	assert (out[i] == 0);
+    }
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-12.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-12.c
new file mode 100644
index 00000000000..a7c12c3797b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-12.c
@@ -0,0 +1,23 @@
+/* { dg-do run { target { riscv_vector } } } */
+/* { dg-options "--param riscv-autovec-preference=fixed-vlmax -O3 --param riscv-autovec-lmul=m2" } */
+
+#include <stdint-gcc.h>
+#include <assert.h>
+#define N 8
+
+int
+main ()
+{
+  int mask[N] = {0, 1, 0, 1, 0, 1, 0, 1};
+  int out[N] = {0};
+  for (int i = 0; i < N; ++i)
+    if (mask[i])
+      out[i] = i;
+  for (int i = 0; i < N; ++i)
+    {
+      if (mask[i])
+	assert (out[i] == i);
+      else
+	assert (out[i] == 0);
+    }
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-13.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-13.c
new file mode 100644
index 00000000000..726238c1cd8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-13.c
@@ -0,0 +1,23 @@
+/* { dg-do run { target { riscv_vector } } } */
+/* { dg-options "--param riscv-autovec-preference=fixed-vlmax -O3 --param riscv-autovec-lmul=m4" } */
+
+#include <stdint-gcc.h>
+#include <assert.h>
+#define N 16
+
+int
+main ()
+{
+  int mask[N] = {0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1};
+  int out[N] = {0};
+  for (int i = 0; i < N; ++i)
+    if (mask[i])
+      out[i] = i;
+  for (int i = 0; i < N; ++i)
+    {
+      if (mask[i])
+	assert (out[i] == i);
+      else
+	assert (out[i] == 0);
+    }
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-14.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-14.c
new file mode 100644
index 00000000000..c369cf0b268
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-14.c
@@ -0,0 +1,24 @@
+/* { dg-do run { target { riscv_vector } } } */
+/* { dg-options "--param riscv-autovec-preference=fixed-vlmax -O3 --param riscv-autovec-lmul=m8" } */
+
+#include <stdint-gcc.h>
+#include <assert.h>
+#define N 32
+
+int
+main ()
+{
+  int mask[N] = {0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1,
+		 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1};
+  int out[N] = {0};
+  for (int i = 0; i < N; ++i)
+    if (mask[i])
+      out[i] = i;
+  for (int i = 0; i < N; ++i)
+    {
+      if (mask[i])
+	assert (out[i] == i);
+      else
+	assert (out[i] == 0);
+    }
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-2.c
new file mode 100644
index 00000000000..a23e47171bc
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-2.c
@@ -0,0 +1,23 @@
+/* { dg-do run { target { riscv_vector } } } */
+/* { dg-options "--param riscv-autovec-preference=fixed-vlmax -O3" } */
+
+#include <stdint-gcc.h>
+#include <assert.h>
+#define N 16
+
+int
+main ()
+{
+  int mask[N] = {0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1};
+  int out[N] = {0};
+  for (int i = 0; i < N; ++i)
+    if (mask[i])
+      out[i] = i;
+  for (int i = 0; i < N; ++i)
+    {
+      if (mask[i])
+	assert (out[i] == i);
+      else
+	assert (out[i] == 0);
+    }
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-3.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-3.c
new file mode 100644
index 00000000000..6ea8fdd89c0
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-3.c
@@ -0,0 +1,23 @@
+/* { dg-do run { target { riscv_vector } } } */
+/* { dg-options "--param riscv-autovec-preference=fixed-vlmax -O3" } */
+
+#include <stdint-gcc.h>
+#include <assert.h>
+#define N 16
+
+int
+main ()
+{
+  int16_t mask[N] = {0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1};
+  int16_t out[N] = {0};
+  for (int16_t i = 0; i < N; ++i)
+    if (mask[i])
+      out[i] = i;
+  for (int16_t i = 0; i < N; ++i)
+    {
+      if (mask[i])
+	assert (out[i] == i);
+      else
+	assert (out[i] == 0);
+    }
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-4.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-4.c
new file mode 100644
index 00000000000..2d97c26abfd
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-4.c
@@ -0,0 +1,23 @@
+/* { dg-do run { target { riscv_vector } } } */
+/* { dg-options "--param riscv-autovec-preference=fixed-vlmax -O3" } */
+
+#include <stdint-gcc.h>
+#include <assert.h>
+#define N 16
+
+int
+main ()
+{
+  int8_t mask[N] = {0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1};
+  int8_t out[N] = {0};
+  for (int8_t i = 0; i < N; ++i)
+    if (mask[i])
+      out[i] = i;
+  for (int8_t i = 0; i < N; ++i)
+    {
+      if (mask[i])
+	assert (out[i] == i);
+      else
+	assert (out[i] == 0);
+    }
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-5.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-5.c
new file mode 100644
index 00000000000..b89b70e99a6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-5.c
@@ -0,0 +1,25 @@
+/* { dg-do run { target { riscv_vector } } } */
+/* { dg-options "--param riscv-autovec-preference=fixed-vlmax --param riscv-autovec-lmul=m2 -O3" } */
+
+#include <stdint-gcc.h>
+#include <assert.h>
+
+#define N 32
+
+int
+main ()
+{
+  int8_t mask[N] = {0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1,
+		    0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1};
+  int8_t out[N] = {0};
+  for (int8_t i = 0; i < N; ++i)
+    if (mask[i])
+      out[i] = i;
+  for (int8_t i = 0; i < N; ++i)
+    {
+      if (mask[i])
+	assert (out[i] == i);
+      else
+	assert (out[i] == 0);
+    }
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-6.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-6.c
new file mode 100644
index 00000000000..ac8d91e793b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-6.c
@@ -0,0 +1,27 @@
+/* { dg-do run { target { riscv_vector } } } */
+/* { dg-options "--param riscv-autovec-preference=fixed-vlmax --param riscv-autovec-lmul=m4 -O3" } */
+
+#include <stdint-gcc.h>
+#include <assert.h>
+
+#define N 64
+
+int
+main ()
+{
+  int8_t mask[N] = {0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1,
+		    0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1,
+		    0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1,
+		    0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1};
+  int8_t out[N] = {0};
+  for (int8_t i = 0; i < N; ++i)
+    if (mask[i])
+      out[i] = i;
+  for (int8_t i = 0; i < N; ++i)
+    {
+      if (mask[i])
+	assert (out[i] == i);
+      else
+	assert (out[i] == 0);
+    }
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-7.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-7.c
new file mode 100644
index 00000000000..f538db23b1d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-7.c
@@ -0,0 +1,30 @@
+/* { dg-do run { target { riscv_vector } } } */
+/* { dg-options "--param riscv-autovec-preference=fixed-vlmax --param riscv-autovec-lmul=m8 -O3" } */
+
+#include <stdint-gcc.h>
+#include <assert.h>
+
+#define N 128
+
+int
+main ()
+{
+  uint8_t mask[N]
+    = {0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1,
+       0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1,
+       0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1,
+       0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1,
+       0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1,
+       0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1};
+  uint8_t out[N] = {0};
+  for (uint8_t i = 0; i < N; ++i)
+    if (mask[i])
+      out[i] = i;
+  for (uint8_t i = 0; i < N; ++i)
+    {
+      if (mask[i])
+	assert (out[i] == i);
+      else
+	assert (out[i] == 0);
+    }
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-8.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-8.c
new file mode 100644
index 00000000000..5abb34c1686
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-8.c
@@ -0,0 +1,30 @@
+/* { dg-do run { target { riscv_vector } } } */
+/* { dg-options "--param riscv-autovec-preference=fixed-vlmax --param riscv-autovec-lmul=m8 -O3" } */
+
+#include <stdint-gcc.h>
+#include <assert.h>
+
+#define N 128
+
+int
+main ()
+{
+  int mask[N]
+    = {0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1,
+       0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1,
+       0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1,
+       0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1,
+       0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1,
+       0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1};
+  int out[N] = {0};
+  for (int i = 0; i < N; ++i)
+    if (mask[i])
+      out[i] = i;
+  for (int i = 0; i < N; ++i)
+    {
+      if (mask[i])
+	assert (out[i] == i);
+      else
+	assert (out[i] == 0);
+    }
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-9.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-9.c
new file mode 100644
index 00000000000..6fdaa516534
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-9.c
@@ -0,0 +1,30 @@
+/* { dg-do run { target { riscv_vector } } } */
+/* { dg-options "--param riscv-autovec-preference=fixed-vlmax --param riscv-autovec-lmul=m8 -O3" } */
+
+#include <stdint-gcc.h>
+#include <assert.h>
+
+#define N 128
+
+int
+main ()
+{
+  int64_t mask[N]
+    = {0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1,
+       0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1,
+       0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1,
+       0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1,
+       0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1,
+       0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1};
+  int64_t out[N] = {0};
+  for (int i = 0; i < N; ++i)
+    if (mask[i])
+      out[i] = i;
+  for (int i = 0; i < N; ++i)
+    {
+      if (mask[i])
+	assert (out[i] == i);
+      else
+	assert (out[i] == 0);
+    }
+}
-- 
2.36.1



* Re: [PATCH V3] RISC-V: Fix bug of pre-calculated const vector mask for VNx1BI, VNx2BI and VNx4BI
  2023-06-28  9:47 [PATCH V3] RISC-V: Fix bug of pre-calculated const vector mask for VNx1BI, VNx2BI and VNx4BI Juzhe-Zhong
@ 2023-06-28 18:11 ` Jeff Law
  2023-06-28 19:02   ` 钟居哲
  0 siblings, 1 reply; 21+ messages in thread
From: Jeff Law @ 2023-06-28 18:11 UTC (permalink / raw)
  To: Juzhe-Zhong, gcc-patches
  Cc: kito.cheng, kito.cheng, palmer, palmer, rdapp.gcc

[-- Attachment #1: Type: text/plain, Size: 1213 bytes --]



On 6/28/23 03:47, Juzhe-Zhong wrote:
> This bug blocks the following patches.
> 
> GCC doesn't know RVV is using compact mask model.
> Consider this following case:
> 
> #define N 16
> 
> int
> main ()
> {
>    int8_t mask[N] = {0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1};
>    int8_t out[N] = {0};
>    for (int8_t i = 0; i < N; ++i)
>      if (mask[i])
>        out[i] = i;
>    for (int8_t i = 0; i < N; ++i)
>      {
>        if (mask[i])
> 	assert (out[i] == i);
>        else
> 	assert (out[i] == 0);
>      }
> }
> 
> Before this patch, the pre-calculated mask in constant memory pool:
> .LC1:
>          .byte   68 ====> 0b01000100
> 
> This is incorrect, such case failed in execution.
> 
> After this patch:
> .LC1:
> 	.byte	10 ====> 0b1010
So I don't get anything like this in my testing.  What are the precise 
arguments you're using to build the testcase?

I'm compiling the test using a trunk compiler with

  -O3 --param riscv-autovec-preference=fixed-vlmax -march=rv64gcv

I get the attached code both before and after your patch.  Clearly I'm
doing something different/wrong.  So my request is for the precise
command line you're using and the before/after resulting assembly code.

Jeff

[-- Attachment #2: j.s --]
[-- Type: text/plain, Size: 2415 bytes --]

	.file	"j.c"
	.option nopic
	.attribute arch, "rv64i2p1_m2p0_a2p1_f2p2_d2p2_c2p0_v1p0_zicsr2p0_zifencei2p0_zve32f1p0_zve32x1p0_zve64d1p0_zve64f1p0_zve64x1p0_zvl128b1p0_zvl32b1p0_zvl64b1p0"
	.attribute unaligned_access, 0
	.attribute stack_align, 16
	.text
	.section	.rodata.str1.8,"aMS",@progbits,1
	.align	3
.LC1:
	.string	"j.c"
	.align	3
.LC2:
	.string	"out[i] == i"
	.align	3
.LC3:
	.string	"out[i] == 0"
	.section	.text.startup,"ax",@progbits
	.align	1
	.globl	main
	.type	main, @function
main:
.LFB0:
	.cfi_startproc
	lui	a5,%hi(.LANCHOR0)
	addi	a5,a5,%lo(.LANCHOR0)
	ld	a4,0(a5)
	ld	a5,8(a5)
	addi	sp,sp,-48
	.cfi_def_cfa_offset 48
	vsetivli	zero,16,e8,m1,ta,ma
	sd	zero,16(sp)
	sd	a4,0(sp)
	sd	a5,8(sp)
	sd	ra,40(sp)
	.cfi_offset 1, -8
	addi	a5,sp,16
	sd	zero,24(sp)
	vid.v	v1
	vl1re8.v	v0,0(sp)
	vmsne.vi	v0,v0,0
	vsetvli	a4,zero,e8,m1,ta,ma
	vse8.v	v1,0(a5),v0.t
	lbu	a5,16(sp)
	bne	a5,zero,.L2
	lbu	a4,17(sp)
	li	a5,1
	bne	a4,a5,.L3
	lbu	a5,18(sp)
	bne	a5,zero,.L2
	lbu	a4,19(sp)
	li	a5,3
	bne	a4,a5,.L3
	lbu	a5,20(sp)
	bne	a5,zero,.L2
	lbu	a4,21(sp)
	li	a5,5
	bne	a4,a5,.L3
	lbu	a5,22(sp)
	bne	a5,zero,.L2
	lbu	a4,23(sp)
	li	a5,7
	bne	a4,a5,.L3
	lbu	a5,24(sp)
	bne	a5,zero,.L2
	lbu	a4,25(sp)
	li	a5,9
	bne	a4,a5,.L3
	lbu	a5,26(sp)
	bne	a5,zero,.L2
	lbu	a4,27(sp)
	li	a5,11
	bne	a4,a5,.L3
	lbu	a5,28(sp)
	bne	a5,zero,.L2
	lbu	a4,29(sp)
	li	a5,13
	bne	a4,a5,.L3
	lbu	a5,30(sp)
	bne	a5,zero,.L2
	lbu	a4,31(sp)
	li	a5,15
	bne	a4,a5,.L3
	ld	ra,40(sp)
	.cfi_remember_state
	.cfi_restore 1
	li	a0,0
	addi	sp,sp,48
	.cfi_def_cfa_offset 0
	jr	ra
.L2:
	.cfi_restore_state
	lui	a3,%hi(__PRETTY_FUNCTION__.0)
	lui	a1,%hi(.LC1)
	lui	a0,%hi(.LC3)
	addi	a3,a3,%lo(__PRETTY_FUNCTION__.0)
	li	a2,18
	addi	a1,a1,%lo(.LC1)
	addi	a0,a0,%lo(.LC3)
	call	__assert_fail
.L3:
	lui	a3,%hi(__PRETTY_FUNCTION__.0)
	lui	a1,%hi(.LC1)
	lui	a0,%hi(.LC2)
	addi	a3,a3,%lo(__PRETTY_FUNCTION__.0)
	li	a2,16
	addi	a1,a1,%lo(.LC1)
	addi	a0,a0,%lo(.LC2)
	call	__assert_fail
	.cfi_endproc
.LFE0:
	.size	main, .-main
	.section	.rodata
	.align	3
	.set	.LANCHOR0,. + 0
.LC0:
	.string	""
	.string	"\001"
	.string	"\001"
	.string	"\001"
	.string	"\001"
	.string	"\001"
	.string	"\001"
	.string	"\001"
	.ascii	"\001"
	.section	.srodata,"a"
	.align	3
	.type	__PRETTY_FUNCTION__.0, @object
	.size	__PRETTY_FUNCTION__.0, 5
__PRETTY_FUNCTION__.0:
	.string	"main"
	.ident	"GCC: (GNU) 14.0.0 20230628 (experimental)"
	.section	.note.GNU-stack,"",@progbits


* Re: Re: [PATCH V3] RISC-V: Fix bug of pre-calculated const vector mask for VNx1BI, VNx2BI and VNx4BI
  2023-06-28 18:11 ` Jeff Law
@ 2023-06-28 19:02   ` 钟居哲
  2023-06-28 19:12     ` Robin Dapp
  0 siblings, 1 reply; 21+ messages in thread
From: 钟居哲 @ 2023-06-28 19:02 UTC (permalink / raw)
  To: Jeff Law, gcc-patches; +Cc: kito.cheng, kito.cheng, palmer, palmer, rdapp.gcc

[-- Attachment #1: Type: text/plain, Size: 1575 bytes --]

Try this:
https://godbolt.org/z/x7bM5Pr84 




juzhe.zhong@rivai.ai
 
From: Jeff Law
Date: 2023-06-29 02:11
To: Juzhe-Zhong; gcc-patches
CC: kito.cheng; kito.cheng; palmer; palmer; rdapp.gcc
Subject: Re: [PATCH V3] RISC-V: Fix bug of pre-calculated const vector mask for VNx1BI, VNx2BI and VNx4BI
 
 
On 6/28/23 03:47, Juzhe-Zhong wrote:
> This bug blocks the following patches.
> 
> GCC doesn't know RVV is using compact mask model.
> Consider this following case:
> 
> #define N 16
> 
> int
> main ()
> {
>    int8_t mask[N] = {0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1};
>    int8_t out[N] = {0};
>    for (int8_t i = 0; i < N; ++i)
>      if (mask[i])
>        out[i] = i;
>    for (int8_t i = 0; i < N; ++i)
>      {
>        if (mask[i])
> assert (out[i] == i);
>        else
> assert (out[i] == 0);
>      }
> }
> 
> Before this patch, the pre-calculated mask in constant memory pool:
> .LC1:
>          .byte   68 ====> 0b01000100
> 
> This is incorrect, such case failed in execution.
> 
> After this patch:
> .LC1:
> .byte 10 ====> 0b1010
So I don't get anything like this in my testing.  What are the precise 
arguments you're using to build the testcase?
 
I'm compiling the test use a trunk compiler with
 
  -O3 --param riscv-autovec-preference=fixed-vlmax -march=rv64gcv
 
I get the attached code both before and after your patch.  Clearly I'm 
doing something different/wrong.    So my request is for the precise 
command line you're using and the before/after resulting assembly code.
 
Jeff


* Re: [PATCH V3] RISC-V: Fix bug of pre-calculated const vector mask for VNx1BI, VNx2BI and VNx4BI
  2023-06-28 19:02   ` 钟居哲
@ 2023-06-28 19:12     ` Robin Dapp
  2023-06-28 20:42       ` Richard Sandiford
  0 siblings, 1 reply; 21+ messages in thread
From: Robin Dapp @ 2023-06-28 19:12 UTC (permalink / raw)
  To: 钟居哲, Jeff Law, gcc-patches
  Cc: rdapp.gcc, kito.cheng, kito.cheng, palmer, palmer

Hi Juzhe,

I find the bug description rather confusing.  What I can see is that
the constant in the literal pool is indeed wrong but how would DSE or
so play a role there?  Particularly only for the smaller modes?

My suspicion would be that the constant in the literal/constant pool
is wrong from start to finish.

I just played around with the following hunk:

diff --git a/gcc/varasm.cc b/gcc/varasm.cc
index 542315f88cd..5223c08924f 100644
--- a/gcc/varasm.cc
+++ b/gcc/varasm.cc
@@ -4061,7 +4061,7 @@ output_constant_pool_2 (fixed_size_mode mode, rtx x, unsigned int align)
           whole element.  Often this is byte_mode and contains more
           than one element.  */
        unsigned int nelts = GET_MODE_NUNITS (mode);
-       unsigned int elt_bits = GET_MODE_BITSIZE (mode) / nelts;
+       unsigned int elt_bits = GET_MODE_PRECISION (mode) / nelts;
        unsigned int int_bits = MAX (elt_bits, BITS_PER_UNIT);
        scalar_int_mode int_mode = int_mode_for_size (int_bits, 0).require ();

With this, all your examples pass for me.  We then pack e.g. 16 VNx2BI elements
into an int and not just 8.  It would also explain why it works for modes
where PRECISION == BITSIZE.  Now it will certainly require a more thorough
analysis but maybe it's a start?
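
The arithmetic in that hunk can be modelled as follows.  This is a sketch
under stated assumptions, not the varasm.cc code: the struct and function
names are hypothetical, BITS_PER_UNIT is taken to be 8, and the
elements-per-integer step is an assumption about how the result is used,
since it is not part of the hunk shown.

```c
#include <assert.h>

/* Hypothetical container for the derived packing parameters.  */
struct pack_params
{
  unsigned elt_bits;     /* bits reserved per mask element */
  unsigned int_bits;     /* width of each emitted integer */
  unsigned elts_per_int; /* elements packed into each integer (assumed) */
};

/* MODE_BITS stands for GET_MODE_BITSIZE (before the change) or
   GET_MODE_PRECISION (after the change); NELTS for GET_MODE_NUNITS.  */
struct pack_params
compute_pack_params (unsigned mode_bits, unsigned nelts)
{
  assert (nelts > 0 && mode_bits >= nelts && mode_bits % nelts == 0);

  struct pack_params p;
  p.elt_bits = mode_bits / nelts;
  /* MAX (elt_bits, BITS_PER_UNIT), with BITS_PER_UNIT assumed to be 8.  */
  p.int_bits = p.elt_bits > 8 ? p.elt_bits : 8;
  p.elts_per_int = p.int_bits / p.elt_bits;
  return p;
}
```

For a 4-element mask mode with BITSIZE 8 and PRECISION 4 (illustrative
values), compute_pack_params (8, 4) reserves 2 bits per element and fits 4
elements in a byte, whereas compute_pack_params (4, 4) reserves 1 bit per
element and fits 8, which is the compact layout.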

Regards
 Robin



* Re: [PATCH V3] RISC-V: Fix bug of pre-calculated const vector mask for VNx1BI, VNx2BI and VNx4BI
  2023-06-28 19:12     ` Robin Dapp
@ 2023-06-28 20:42       ` Richard Sandiford
  2023-06-28 21:46         ` 钟居哲
  2023-06-29  7:53         ` Richard Sandiford
  0 siblings, 2 replies; 21+ messages in thread
From: Richard Sandiford @ 2023-06-28 20:42 UTC (permalink / raw)
  To: Robin Dapp via Gcc-patches
  Cc: 钟居哲,
	Jeff Law, Robin Dapp, kito.cheng, kito.cheng, palmer, palmer

Robin Dapp via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
> Hi Juzhe,
>
> I find the bug description rather confusing.  What I can see is that
> the constant in the literal pool is indeed wrong but how would DSE or
> so play a role there?  Particularly only for the smaller modes?
>
> My suspicion would be that the constant in the literal/constant pool
> is wrong from start to finish.
>
> I just played around with the following hunk:
>
> diff --git a/gcc/varasm.cc b/gcc/varasm.cc
> index 542315f88cd..5223c08924f 100644
> --- a/gcc/varasm.cc
> +++ b/gcc/varasm.cc
> @@ -4061,7 +4061,7 @@ output_constant_pool_2 (fixed_size_mode mode, rtx x, unsigned int align)
>            whole element.  Often this is byte_mode and contains more
>            than one element.  */
>         unsigned int nelts = GET_MODE_NUNITS (mode);
> -       unsigned int elt_bits = GET_MODE_BITSIZE (mode) / nelts;
> +       unsigned int elt_bits = GET_MODE_PRECISION (mode) / nelts;
>         unsigned int int_bits = MAX (elt_bits, BITS_PER_UNIT);
>         scalar_int_mode int_mode = int_mode_for_size (int_bits, 0).require ();
>
> With this all your examples pass for me.  We then pack e.g. 16 VNx2BI elements
> into an int and not just 8.  It would also explain why it works for modes
> where PRECISION == BITSIZE.  Now it will certainly require a more thorough
> analysis but maybe it's a start?

Yeah.  Preapproved for trunk & any necessary branches.

Thanks,
Richard

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Re: [PATCH V3] RISC-V: Fix bug of pre-calculated const vector mask for VNx1BI, VNx2BI and VNx4BI
  2023-06-28 20:42       ` Richard Sandiford
@ 2023-06-28 21:46         ` 钟居哲
  2023-06-29  7:53         ` Richard Sandiford
  1 sibling, 0 replies; 21+ messages in thread
From: 钟居哲 @ 2023-06-28 21:46 UTC (permalink / raw)
  To: richard.sandiford, gcc-patches
  Cc: Jeff Law, rdapp.gcc, kito.cheng, kito.cheng, palmer, palmer

[-- Attachment #1: Type: text/plain, Size: 1900 bytes --]

OK. Please go ahead and commit this change together with the testcases,
so that it no longer blocks the following patches.

Thanks.


juzhe.zhong@rivai.ai
 

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH V3] RISC-V: Fix bug of pre-calculated const vector mask for VNx1BI, VNx2BI and VNx4BI
  2023-06-28 20:42       ` Richard Sandiford
  2023-06-28 21:46         ` 钟居哲
@ 2023-06-29  7:53         ` Richard Sandiford
  2023-06-29  8:08           ` juzhe.zhong
  2023-06-29  8:14           ` Robin Dapp
  1 sibling, 2 replies; 21+ messages in thread
From: Richard Sandiford @ 2023-06-29  7:53 UTC (permalink / raw)
  To: Robin Dapp via Gcc-patches
  Cc: 钟居哲,
	Jeff Law, Robin Dapp, kito.cheng, kito.cheng, palmer, palmer

Richard Sandiford <richard.sandiford@arm.com> writes:
> Robin Dapp via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
>> Hi Juzhe,
>>
>> I find the bug description rather confusing.  What I can see is that
>> the constant in the literal pool is indeed wrong but how would DSE or
>> so play a role there?  Particularly only for the smaller modes?
>>
>> My suspicion would be that the constant in the literal/constant pool
>> is wrong from start to finish.
>>
>> I just played around with the following hunk:
>>
>> diff --git a/gcc/varasm.cc b/gcc/varasm.cc
>> index 542315f88cd..5223c08924f 100644
>> --- a/gcc/varasm.cc
>> +++ b/gcc/varasm.cc
>> @@ -4061,7 +4061,7 @@ output_constant_pool_2 (fixed_size_mode mode, rtx x, unsigned int align)
>>            whole element.  Often this is byte_mode and contains more
>>            than one element.  */
>>         unsigned int nelts = GET_MODE_NUNITS (mode);
>> -       unsigned int elt_bits = GET_MODE_BITSIZE (mode) / nelts;
>> +       unsigned int elt_bits = GET_MODE_PRECISION (mode) / nelts;
>>         unsigned int int_bits = MAX (elt_bits, BITS_PER_UNIT);
>>         scalar_int_mode int_mode = int_mode_for_size (int_bits, 0).require ();
>>
>> With this all your examples pass for me.  We then pack e.g. 16 VNx2BI elements
>> into an int and not just 8.  It would also explain why it works for modes
>> where PRECISION == BITSIZE.  Now it will certainly require a more thorough
>> analysis but maybe it's a start?
>
> Yeah.  Preapproved for trunk & any necessary branches.

Sorry, only realised later, but: if the precision can cover fewer
bytes than the bitsize, I suppose there ought to be some zero-byte
padding at the end as well.

Thanks,
Richard

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Re: [PATCH V3] RISC-V: Fix bug of pre-calculated const vector mask for VNx1BI, VNx2BI and VNx4BI
  2023-06-29  7:53         ` Richard Sandiford
@ 2023-06-29  8:08           ` juzhe.zhong
  2023-06-29  8:14           ` Robin Dapp
  1 sibling, 0 replies; 21+ messages in thread
From: juzhe.zhong @ 2023-06-29  8:08 UTC (permalink / raw)
  To: richard.sandiford, gcc-patches
  Cc: jeffreyalaw, Robin Dapp, kito.cheng, Kito.cheng, palmer, palmer

[-- Attachment #1: Type: text/plain, Size: 2909 bytes --]

Yes. There is a workaround in place for RVV.

Ideally, every mode would have PRECISION == BITSIZE. However, RVV had a bug that caused incorrect DSE:
VNx1BI (occupies 1 bit), VNx2BI (2 bits), VNx4BI (4 bits) and VNx8BI (8 bits) all have the same BYTESIZE,
which caused the incorrect DSE.

So we added a workaround (ADJUST_PRECISION) to fix it:
https://github.com/gcc-mirror/gcc/commit/247cacc9e381d666a492dfa4ed61b7b19e2d008f 
which prevents the incorrect DSE.

But the mask bit layout in memory then comes out wrong because of the inconsistency between PRECISION and BITSIZE.
So I force GCC to handle this in the RISC-V backend for VNx1BI/VNx2BI/VNx4BI.

I think this is a RISC-V backend issue and can be addressed well in the RISC-V port (as in the patch I posted).
There is no need to touch generic code, since other targets would not have the same issue.

Thanks.


juzhe.zhong@rivai.ai
 

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH V3] RISC-V: Fix bug of pre-calculated const vector mask for VNx1BI, VNx2BI and VNx4BI
  2023-06-29  7:53         ` Richard Sandiford
  2023-06-29  8:08           ` juzhe.zhong
@ 2023-06-29  8:14           ` Robin Dapp
  2023-06-29  8:18             ` juzhe.zhong
  2023-06-29  8:54             ` Richard Sandiford
  1 sibling, 2 replies; 21+ messages in thread
From: Robin Dapp @ 2023-06-29  8:14 UTC (permalink / raw)
  To: Robin Dapp via Gcc-patches, 钟居哲,
	Jeff Law, kito.cheng, kito.cheng, palmer, palmer,
	richard.sandiford
  Cc: rdapp.gcc

I grep'ed a bit and found several more instances of the same pattern
which would probably all have to be adjusted (frontend-related mostly
but also in native_encode_rtx).  Most likely they would all have to
be adjusted? 

> Sorry, only realised later, but: if the precision can cover fewer
> bytes than the bitsize, I suppose there ought to be some zero-byte
> padding at the end as well.
It looks like this problem, and also the padding, has been discussed
before when the precision of VNx1BI etc. was first adjusted in the
RISC-V backend?

I didn't immediately get the padding, though.  So if we e.g. have a
VNx2BI constant {0, 1} what would we pad the resulting value "2" to?
A full byte?

Juzhe, are we absolutely sure this is the only problem we will have
with precision != bitsize and it is confined to the backend?  I would
not dare to make that call.  How does DSE come in here at all as you
keep mentioning it?

Regards
 Robin


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Re: [PATCH V3] RISC-V: Fix bug of pre-calculated const vector mask for VNx1BI, VNx2BI and VNx4BI
  2023-06-29  8:14           ` Robin Dapp
@ 2023-06-29  8:18             ` juzhe.zhong
  2023-06-29  8:53               ` Robin Dapp
  2023-06-29  8:54             ` Richard Sandiford
  1 sibling, 1 reply; 21+ messages in thread
From: juzhe.zhong @ 2023-06-29  8:18 UTC (permalink / raw)
  To: Robin Dapp, gcc-patches, jeffreyalaw, kito.cheng, Kito.cheng,
	palmer, palmer, richard.sandiford
  Cc: Robin Dapp

[-- Attachment #1: Type: text/plain, Size: 1905 bytes --]

>> are we absolutely sure this is the only problem we will have
>> with precision != bitsize and it is confined to the backend?
Yes.


>> I would
>> not dare to make that call.  How does DSE come in here at all as you
>> keep mentioning it?
I mention DSE because:
We had a DSE issue before, so we use ADJUST_PRECISION to make PRECISION != BITSIZE in order to work around that DSE issue:
https://github.com/gcc-mirror/gcc/commit/247cacc9e381d666a492dfa4ed61b7b19e2d008f 

However, that patch fixed the DSE issue by making PRECISION != BITSIZE, so GCC now generates padding bits for these modes, which we
don't want.



juzhe.zhong@rivai.ai
 

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH V3] RISC-V: Fix bug of pre-calculated const vector mask for VNx1BI, VNx2BI and VNx4BI
  2023-06-29  8:18             ` juzhe.zhong
@ 2023-06-29  8:53               ` Robin Dapp
  2023-06-29  9:01                 ` juzhe.zhong
  0 siblings, 1 reply; 21+ messages in thread
From: Robin Dapp @ 2023-06-29  8:53 UTC (permalink / raw)
  To: juzhe.zhong, gcc-patches, jeffreyalaw, kito.cheng, Kito.cheng,
	palmer, palmer, richard.sandiford
  Cc: rdapp.gcc

>>> are we absolutely sure this is the only problem we will have
>>> with precision != bitsize and it is confined to the backend?
> Yes.

With vinfo.vector_mode == VNx4SI
 mask_type = get_mask_type_for_scalar_type (vinfo, int)
mask_type is:
 vector(4) <signed-boolean:2>

I.e. the precision is 2.  This is definitely fishy and related
to the same problem.  I would almost bet that something in the
middle-end relies on the precision for some optimization but
we just haven't hit it yet.

Then we have
 vector(2) <signed-boolean:4> (precision 4)
as a mask type for vector(2) long int.

Likewise we would likely have a precision of 8 for a vector(1)?
Those might be less severe but still...

And that's just what I'm seeing spontaneously after like five
minutes.

Regards
 Robin

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH V3] RISC-V: Fix bug of pre-calculated const vector mask for VNx1BI, VNx2BI and VNx4BI
  2023-06-29  8:14           ` Robin Dapp
  2023-06-29  8:18             ` juzhe.zhong
@ 2023-06-29  8:54             ` Richard Sandiford
  2023-06-29  9:09               ` Robin Dapp
  1 sibling, 1 reply; 21+ messages in thread
From: Richard Sandiford @ 2023-06-29  8:54 UTC (permalink / raw)
  To: Robin Dapp
  Cc: Robin Dapp via Gcc-patches, 钟居哲,
	Jeff Law, kito.cheng, kito.cheng, palmer, palmer

Robin Dapp <rdapp.gcc@gmail.com> writes:
>> Sorry, only realised later, but: if the precision can cover fewer
>> bytes than the bitsize, I suppose there ought to be some zero-byte
>> padding at the end as well.
> It looks like this problem, and also the padding, has been discussed
> before when the precision of VNx1BI etc. was first adjusted in the
> RISC-V backend?

Very probably.  Can't remember now.

> I didn't immediately get the padding, though.  So if we e.g. have a
> VNx2BI constant {0, 1} what would we pad the resulting value "2" to?
> A full byte?

Yeah, that part is OK, and was the case I was thinking about when
I said OK yesterday.  But now that we allow BITSIZE != PRECISION,
it's possible for BITSIZE - PRECISION to be more than a full byte,
in which case the new loop would not initialise every byte of
the mode.

I vaguely remembered that that could happen for RVV_FIXED_VLMAX,
but perhaps I misremember.  If it can't happen then an assert
would be OK instead.
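
As a rough illustration of the case I mean (a standalone sketch, not the
varasm.cc loop; the helper names are hypothetical), the number of
trailing bytes that a loop covering only PRECISION bits would leave
unwritten can be computed like this:

```c
/* Bytes needed to hold BITS bits, rounded up to a whole byte.  */
static unsigned int
bytes_for_bits (unsigned int bits)
{
  return (bits + 7) / 8;
}

/* Number of trailing bytes of a mode that a loop covering only
   PRECISION bits would not write.  Assumes PRECISION <= BITSIZE.  */
static unsigned int
unwritten_tail_bytes (unsigned int bitsize, unsigned int precision)
{
  return bytes_for_bits (bitsize) - bytes_for_bits (precision);
}
```

With BITSIZE 8 and PRECISION 2 the result is 0, so nothing is missed.
With a hypothetical BITSIZE 16 and PRECISION 1 the result is 1, and that
byte would need explicit zero padding.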

Thanks,
Richard

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Re: [PATCH V3] RISC-V: Fix bug of pre-calculated const vector mask for VNx1BI, VNx2BI and VNx4BI
  2023-06-29  8:53               ` Robin Dapp
@ 2023-06-29  9:01                 ` juzhe.zhong
  0 siblings, 0 replies; 21+ messages in thread
From: juzhe.zhong @ 2023-06-29  9:01 UTC (permalink / raw)
  To: Robin Dapp, gcc-patches, jeffreyalaw, kito.cheng, Kito.cheng,
	palmer, palmer, richard.sandiford
  Cc: Robin Dapp

[-- Attachment #1: Type: text/plain, Size: 1562 bytes --]

Yes, we have no choice since DSE is based on BYTESIZE.

So I work around it in the RISC-V backend by making the precision of VNx1BI, VNx2BI and VNx4BI different from that of VNx8BI, to prevent incorrect DSE.
I think this issue can be addressed properly once everything is adjusted to use BITSIZE instead of BYTESIZE, but that may change too much.
I would prefer to leave that for GCC-15 (the issue can be worked around in the RISC-V backend), since we have too many things that need to land in GCC-14.

Thanks.


juzhe.zhong@rivai.ai
 

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH V3] RISC-V: Fix bug of pre-calculated const vector mask for VNx1BI, VNx2BI and VNx4BI
  2023-06-29  8:54             ` Richard Sandiford
@ 2023-06-29  9:09               ` Robin Dapp
  2023-06-29  9:23                 ` juzhe.zhong
  2023-06-29 11:22                 ` Richard Biener
  0 siblings, 2 replies; 21+ messages in thread
From: Robin Dapp @ 2023-06-29  9:09 UTC (permalink / raw)
  To: Robin Dapp via Gcc-patches, 钟居哲,
	Jeff Law, kito.cheng, kito.cheng, palmer, palmer,
	richard.sandiford
  Cc: rdapp.gcc

> Yeah, that part is OK, and was the case I was thinking about when
> I said OK yesterday.  But now that we allow BITSIZE != PRECISION,
> it's possible for BITSIZE - PRECISION to be more than a full byte,
> in which case the new loop would not initialise every byte of
> the mode.

Ah, I see, so when e.g. BITSIZE == 16 and PRECISION == 1.  Luckily
this cannot happen with RVV as all we do is adjust the precision
of the modes that have BITSIZE == 8.  I'm going to add an assert.
Juzhe would rather work around that in the backend, though.

The other thing I just noticed is

tree
build_truth_vector_type_for_mode (poly_uint64 nunits, machine_mode mask_mode)
{
  gcc_assert (mask_mode != BLKmode);

  unsigned HOST_WIDE_INT esize;
  if (VECTOR_MODE_P (mask_mode))
    {
      poly_uint64 vsize = GET_MODE_BITSIZE (mask_mode);
      esize = vector_element_size (vsize, nunits);
    }
  else
    esize = 1;

  tree bool_type = build_nonstandard_boolean_type (esize);

  return make_vector_type (bool_type, nunits, mask_mode);
}

which gives us wrong precision as we rely on the BITSIZE here as well.
This results in a precision of 1 for VNx8BI, 2 for VNx4BI and 4 for
VNx2BI.

Maybe this isn't a problem per se but to me it appears
just wrong.
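
A quick standalone sketch of the arithmetic (not GCC code; the helper
name is hypothetical, vector_element_size is modelled as plain exact
division, and I'm assuming BITSIZE == 8 for all three modes with
PRECISION 8, 4 and 2 respectively):

```c
/* Per-element size in bits of a mask type, given the total number of
   bits attributed to the mode and its number of units.  Stands in for
   vector_element_size, assuming the division is exact.  */
static unsigned int
mask_elt_size (unsigned int mode_bits, unsigned int nunits)
{
  return mode_bits / nunits;
}
```

Feeding in BITSIZE gives 8/8 = 1, 8/4 = 2 and 8/2 = 4 for VNx8BI, VNx4BI
and VNx2BI, i.e. the values above.  Feeding in the assumed PRECISION
gives 8/8, 4/4 and 2/2, so 1 in every case.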

Regards
 Robin


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Re: [PATCH V3] RISC-V: Fix bug of pre-calculated const vector mask for VNx1BI, VNx2BI and VNx4BI
  2023-06-29  9:09               ` Robin Dapp
@ 2023-06-29  9:23                 ` juzhe.zhong
  2023-06-29 11:22                 ` Richard Biener
  1 sibling, 0 replies; 21+ messages in thread
From: juzhe.zhong @ 2023-06-29  9:23 UTC (permalink / raw)
  To: Robin Dapp, gcc-patches, jeffreyalaw, kito.cheng, Kito.cheng,
	palmer, palmer, richard.sandiford
  Cc: Robin Dapp

[-- Attachment #1: Type: text/plain, Size: 2161 bytes --]

No, I am not saying I want to fix it in the RISC-V backend.
Actually, if you can quickly land the fix in generic code without blocking the following RISC-V patches,
I would be glad to see it. Otherwise, I prefer to fix it in the RISC-V backend for now, provided it is not a big performance issue, and defer the complete fix to GCC-15.

The reason for this plan is that global reviewers' bandwidth is very limited.
We should give the highest priority to middle-end auto-vectorization support first, and then come back to the corner-case issues.

Thanks.


juzhe.zhong@rivai.ai
 

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH V3] RISC-V: Fix bug of pre-calculated const vector mask for VNx1BI, VNx2BI and VNx4BI
  2023-06-29  9:09               ` Robin Dapp
  2023-06-29  9:23                 ` juzhe.zhong
@ 2023-06-29 11:22                 ` Richard Biener
  2023-06-29 11:38                   ` Robin Dapp
  1 sibling, 1 reply; 21+ messages in thread
From: Richard Biener @ 2023-06-29 11:22 UTC (permalink / raw)
  To: Robin Dapp
  Cc: Robin Dapp via Gcc-patches, 钟居哲,
	Jeff Law, kito.cheng, kito.cheng, palmer, palmer,
	richard.sandiford

On Thu, Jun 29, 2023 at 11:10 AM Robin Dapp via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> > Yeah, that part is OK, and was the case I was thinking about when
> > I said OK yesterday.  But now that we allow BITSIZE != PRECISION,
> > it's possible for BITSIZE - PRECISION to be more than a full byte,
> > in which case the new loop would not initialise every byte of
> > the mode.
>
> Ah, I see, so when e.g. BITSIZE == 16 and PRECISION == 1.  Luckily
> this cannot happen with RVV as all we do is adjust the precision
> of the modes that have BITSIZE == 8.  I'm going to add an assert.
> Juzhe would rather work around that in the backend, though.
>
> The other thing I just noticed is
>
> tree
> build_truth_vector_type_for_mode (poly_uint64 nunits, machine_mode mask_mode)
> {
>   gcc_assert (mask_mode != BLKmode);
>
>   unsigned HOST_WIDE_INT esize;
>   if (VECTOR_MODE_P (mask_mode))
>     {
>       poly_uint64 vsize = GET_MODE_BITSIZE (mask_mode);
>       esize = vector_element_size (vsize, nunits);
>     }
>   else
>     esize = 1;
>
>   tree bool_type = build_nonstandard_boolean_type (esize);
>
>   return make_vector_type (bool_type, nunits, mask_mode);
> }
>
> which gives us wrong precision as we rely on the BITSIZE here as well.
> This results in a precision of 1 for VNx8BI, 2 for VNx4BI and 4 for
> VNx2BI.

This should probably use GET_MODE_PRECISION as well.

OK if it bootstraps/tests on both aarch64 and riscv.

Richard.

>
> Maybe this isn't a problem per se but to me it appears
> just wrong.
>
> Regards
>  Robin
>

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH V3] RISC-V: Fix bug of pre-calculated const vector mask for VNx1BI, VNx2BI and VNx4BI
  2023-06-29 11:22                 ` Richard Biener
@ 2023-06-29 11:38                   ` Robin Dapp
  2023-06-29 13:53                     ` Kito Cheng
  0 siblings, 1 reply; 21+ messages in thread
From: Robin Dapp @ 2023-06-29 11:38 UTC (permalink / raw)
  To: Richard Biener
  Cc: rdapp.gcc, Robin Dapp via Gcc-patches, 钟居哲,
	Jeff Law, kito.cheng, kito.cheng, palmer, palmer,
	richard.sandiford

> This should probably use GET_MODE_PRECISION as well.
> 
> OK if it bootstraps/tests on both aarch64 and riscv.
> 
> Richard.

I found several other instances, also in the frontends, that
I'm not exactly sure about.  I'm currently testing this, but the aarch64
bootstrap is still going to take a while; various aarch64 compile
farm machines are down?

Regards
 Robin

From ef919a27f4a156afeca6b4825e6029d9f44be556 Mon Sep 17 00:00:00 2001
From: Robin Dapp <rdapp@ventanamicro.com>
Date: Wed, 28 Jun 2023 20:59:29 +0200
Subject: [PATCH] mode_bitsize -> precision.

bitsize -> precision.
---
 gcc/c-family/c-common.cc      |  2 +-
 gcc/fortran/trans-types.cc    |  2 +-
 gcc/go/go-lang.cc             |  2 +-
 gcc/lto/lto-lang.cc           |  2 +-
 gcc/rust/backend/rust-tree.cc |  2 +-
 gcc/simplify-rtx.cc           | 10 +++++-----
 gcc/tree.cc                   |  2 +-
 gcc/varasm.cc                 |  2 +-
 8 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/gcc/c-family/c-common.cc b/gcc/c-family/c-common.cc
index 34566a342bd..6ab63dae997 100644
--- a/gcc/c-family/c-common.cc
+++ b/gcc/c-family/c-common.cc
@@ -2458,7 +2458,7 @@ c_common_type_for_mode (machine_mode mode, int unsignedp)
   else if (GET_MODE_CLASS (mode) == MODE_VECTOR_BOOL
 	   && valid_vector_subparts_p (GET_MODE_NUNITS (mode)))
     {
-      unsigned int elem_bits = vector_element_size (GET_MODE_BITSIZE (mode),
+      unsigned int elem_bits = vector_element_size (GET_MODE_PRECISION (mode),
 						    GET_MODE_NUNITS (mode));
       tree bool_type = build_nonstandard_boolean_type (elem_bits);
       return build_vector_type_for_mode (bool_type, mode);
diff --git a/gcc/fortran/trans-types.cc b/gcc/fortran/trans-types.cc
index d718f28cc86..987e3d26c46 100644
--- a/gcc/fortran/trans-types.cc
+++ b/gcc/fortran/trans-types.cc
@@ -3403,7 +3403,7 @@ gfc_type_for_mode (machine_mode mode, int unsignedp)
   else if (GET_MODE_CLASS (mode) == MODE_VECTOR_BOOL
 	   && valid_vector_subparts_p (GET_MODE_NUNITS (mode)))
     {
-      unsigned int elem_bits = vector_element_size (GET_MODE_BITSIZE (mode),
+      unsigned int elem_bits = vector_element_size (GET_MODE_PRECISION (mode),
 						    GET_MODE_NUNITS (mode));
       tree bool_type = build_nonstandard_boolean_type (elem_bits);
       return build_vector_type_for_mode (bool_type, mode);
diff --git a/gcc/go/go-lang.cc b/gcc/go/go-lang.cc
index e85a4bfe949..d5c871a533c 100644
--- a/gcc/go/go-lang.cc
+++ b/gcc/go/go-lang.cc
@@ -414,7 +414,7 @@ go_langhook_type_for_mode (machine_mode mode, int unsignedp)
   if (GET_MODE_CLASS (mode) == MODE_VECTOR_BOOL
       && valid_vector_subparts_p (GET_MODE_NUNITS (mode)))
     {
-      unsigned int elem_bits = vector_element_size (GET_MODE_BITSIZE (mode),
+      unsigned int elem_bits = vector_element_size (GET_MODE_PRECISION (mode),
 						    GET_MODE_NUNITS (mode));
       tree bool_type = build_nonstandard_boolean_type (elem_bits);
       return build_vector_type_for_mode (bool_type, mode);
diff --git a/gcc/lto/lto-lang.cc b/gcc/lto/lto-lang.cc
index 52d7626e92e..14d419c2013 100644
--- a/gcc/lto/lto-lang.cc
+++ b/gcc/lto/lto-lang.cc
@@ -1050,7 +1050,7 @@ lto_type_for_mode (machine_mode mode, int unsigned_p)
   else if (GET_MODE_CLASS (mode) == MODE_VECTOR_BOOL
 	   && valid_vector_subparts_p (GET_MODE_NUNITS (mode)))
     {
-      unsigned int elem_bits = vector_element_size (GET_MODE_BITSIZE (mode),
+      unsigned int elem_bits = vector_element_size (GET_MODE_PRECISION (mode),
 						    GET_MODE_NUNITS (mode));
       tree bool_type = build_nonstandard_boolean_type (elem_bits);
       return build_vector_type_for_mode (bool_type, mode);
diff --git a/gcc/rust/backend/rust-tree.cc b/gcc/rust/backend/rust-tree.cc
index 8243d4cf5c6..66e859cd70c 100644
--- a/gcc/rust/backend/rust-tree.cc
+++ b/gcc/rust/backend/rust-tree.cc
@@ -5320,7 +5320,7 @@ c_common_type_for_mode (machine_mode mode, int unsignedp)
 	   && valid_vector_subparts_p (GET_MODE_NUNITS (mode)))
     {
       unsigned int elem_bits
-	= vector_element_size (GET_MODE_BITSIZE (mode), GET_MODE_NUNITS (mode));
+	= vector_element_size (GET_MODE_PRECISION (mode), GET_MODE_NUNITS (mode));
       tree bool_type = build_nonstandard_boolean_type (elem_bits);
       return build_vector_type_for_mode (bool_type, mode);
     }
diff --git a/gcc/simplify-rtx.cc b/gcc/simplify-rtx.cc
index 99cbdd47d93..d7315d82aa3 100644
--- a/gcc/simplify-rtx.cc
+++ b/gcc/simplify-rtx.cc
@@ -7076,7 +7076,7 @@ native_encode_rtx (machine_mode mode, rtx x, vec<target_unit> &bytes,
       /* CONST_VECTOR_ELT follows target memory order, so no shuffling
 	 is necessary.  The only complication is that MODE_VECTOR_BOOL
 	 vectors can have several elements per byte.  */
-      unsigned int elt_bits = vector_element_size (GET_MODE_BITSIZE (mode),
+      unsigned int elt_bits = vector_element_size (GET_MODE_PRECISION (mode),
 						   GET_MODE_NUNITS (mode));
       unsigned int elt = first_byte * BITS_PER_UNIT / elt_bits;
       if (elt_bits < BITS_PER_UNIT)
@@ -7222,7 +7222,7 @@ native_decode_vector_rtx (machine_mode mode, const vec<target_unit> &bytes,
 {
   rtx_vector_builder builder (mode, npatterns, nelts_per_pattern);
 
-  unsigned int elt_bits = vector_element_size (GET_MODE_BITSIZE (mode),
+  unsigned int elt_bits = vector_element_size (GET_MODE_PRECISION (mode),
 					       GET_MODE_NUNITS (mode));
   if (elt_bits < BITS_PER_UNIT)
     {
@@ -7359,7 +7359,7 @@ simplify_const_vector_byte_offset (rtx x, poly_uint64 byte)
 {
   /* Cope with MODE_VECTOR_BOOL by operating on bits rather than bytes.  */
   machine_mode mode = GET_MODE (x);
-  unsigned int elt_bits = vector_element_size (GET_MODE_BITSIZE (mode),
+  unsigned int elt_bits = vector_element_size (GET_MODE_PRECISION (mode),
 					       GET_MODE_NUNITS (mode));
   /* The number of bits needed to encode one element from each pattern.  */
   unsigned int sequence_bits = CONST_VECTOR_NPATTERNS (x) * elt_bits;
@@ -7414,10 +7414,10 @@ simplify_const_vector_subreg (machine_mode outermode, rtx x,
 
   /* Cope with MODE_VECTOR_BOOL by operating on bits rather than bytes.  */
   unsigned int x_elt_bits
-    = vector_element_size (GET_MODE_BITSIZE (innermode),
+    = vector_element_size (GET_MODE_PRECISION (innermode),
 			   GET_MODE_NUNITS (innermode));
   unsigned int out_elt_bits
-    = vector_element_size (GET_MODE_BITSIZE (outermode),
+    = vector_element_size (GET_MODE_PRECISION (outermode),
 			   GET_MODE_NUNITS (outermode));
 
   /* The number of bits needed to encode one element from every pattern
diff --git a/gcc/tree.cc b/gcc/tree.cc
index 58288efa2e2..c68761fccee 100644
--- a/gcc/tree.cc
+++ b/gcc/tree.cc
@@ -10143,7 +10143,7 @@ build_truth_vector_type_for_mode (poly_uint64 nunits, machine_mode mask_mode)
   unsigned HOST_WIDE_INT esize;
   if (VECTOR_MODE_P (mask_mode))
     {
-      poly_uint64 vsize = GET_MODE_BITSIZE (mask_mode);
+      poly_uint64 vsize = GET_MODE_PRECISION (mask_mode);
       esize = vector_element_size (vsize, nunits);
     }
   else
diff --git a/gcc/varasm.cc b/gcc/varasm.cc
index 8ae0a2555cd..f65416cff99 100644
--- a/gcc/varasm.cc
+++ b/gcc/varasm.cc
@@ -4061,7 +4061,7 @@ output_constant_pool_2 (fixed_size_mode mode, rtx x, unsigned int align)
 	   whole element.  Often this is byte_mode and contains more
 	   than one element.  */
 	unsigned int nelts = GET_MODE_NUNITS (mode);
-	unsigned int elt_bits = GET_MODE_BITSIZE (mode) / nelts;
+	unsigned int elt_bits = GET_MODE_PRECISION (mode) / nelts;
 	unsigned int int_bits = MAX (elt_bits, BITS_PER_UNIT);
 	scalar_int_mode int_mode = int_mode_for_size (int_bits, 0).require ();
 	unsigned int mask = GET_MODE_MASK (GET_MODE_INNER (mode));
-- 
2.41.0



* Re: [PATCH V3] RISC-V: Fix bug of pre-calculated const vector mask for VNx1BI, VNx2BI and VNx4BI
  2023-06-29 11:38                   ` Robin Dapp
@ 2023-06-29 13:53                     ` Kito Cheng
  2023-06-29 14:04                       ` Richard Sandiford
  0 siblings, 1 reply; 21+ messages in thread
From: Kito Cheng @ 2023-06-29 13:53 UTC (permalink / raw)
  To: Robin Dapp
  Cc: Richard Biener, Robin Dapp via Gcc-patches,
	钟居哲,
	Jeff Law, kito.cheng, palmer, palmer, richard.sandiford

Hi Robin:

> diff --git a/gcc/lto/lto-lang.cc b/gcc/lto/lto-lang.cc
> index 52d7626e92e..14d419c2013 100644
> --- a/gcc/lto/lto-lang.cc
> +++ b/gcc/lto/lto-lang.cc
> @@ -1050,7 +1050,7 @@ lto_type_for_mode (machine_mode mode, int unsigned_p)
>    else if (GET_MODE_CLASS (mode) == MODE_VECTOR_BOOL
>            && valid_vector_subparts_p (GET_MODE_NUNITS (mode)))
>      {
> -      unsigned int elem_bits = vector_element_size (GET_MODE_BITSIZE (mode),
> +      unsigned int elem_bits = vectwhereor_element_size (GET_MODE_PRECISION (mode),

This seems weird?

>                                                     GET_MODE_NUNITS (mode));
>        tree bool_type = build_nonstandard_boolean_type (elem_bits);
>        return build_vector_type_for_mode (bool_type, mode);


* Re: [PATCH V3] RISC-V: Fix bug of pre-calculated const vector mask for VNx1BI, VNx2BI and VNx4BI
  2023-06-29 13:53                     ` Kito Cheng
@ 2023-06-29 14:04                       ` Richard Sandiford
  2023-06-29 14:12                         ` Robin Dapp
  0 siblings, 1 reply; 21+ messages in thread
From: Richard Sandiford @ 2023-06-29 14:04 UTC (permalink / raw)
  To: Kito Cheng
  Cc: Robin Dapp, Richard Biener, Robin Dapp via Gcc-patches,
	钟居哲,
	Jeff Law, kito.cheng, palmer, palmer

Kito Cheng <kito.cheng@gmail.com> writes:
> Hi Robin:
>
>> diff --git a/gcc/lto/lto-lang.cc b/gcc/lto/lto-lang.cc
>> index 52d7626e92e..14d419c2013 100644
>> --- a/gcc/lto/lto-lang.cc
>> +++ b/gcc/lto/lto-lang.cc
>> @@ -1050,7 +1050,7 @@ lto_type_for_mode (machine_mode mode, int unsigned_p)
>>    else if (GET_MODE_CLASS (mode) == MODE_VECTOR_BOOL
>>            && valid_vector_subparts_p (GET_MODE_NUNITS (mode)))
>>      {
>> -      unsigned int elem_bits = vector_element_size (GET_MODE_BITSIZE (mode),
>> +      unsigned int elem_bits = vectwhereor_element_size (GET_MODE_PRECISION (mode),
>
> This seems weird?

FWIW, I bootstrapped & regression-tested the patch with that fixed
on aarch64-linux-gnu (all languages).

So OK with the above fixed from my POV.

Thanks,
Richard

>
>>                                                     GET_MODE_NUNITS (mode));
>>        tree bool_type = build_nonstandard_boolean_type (elem_bits);
>>        return build_vector_type_for_mode (bool_type, mode);


* Re: [PATCH V3] RISC-V: Fix bug of pre-calculated const vector mask for VNx1BI, VNx2BI and VNx4BI
  2023-06-29 14:04                       ` Richard Sandiford
@ 2023-06-29 14:12                         ` Robin Dapp
  2023-07-04 19:07                           ` Robin Dapp
  0 siblings, 1 reply; 21+ messages in thread
From: Robin Dapp @ 2023-06-29 14:12 UTC (permalink / raw)
  To: Kito Cheng, Richard Biener, Robin Dapp via Gcc-patches,
	钟居哲,
	Jeff Law, kito.cheng, palmer, palmer, richard.sandiford
  Cc: rdapp.gcc

>> Hi Robin:
>>
>>> diff --git a/gcc/lto/lto-lang.cc b/gcc/lto/lto-lang.cc
>>> index 52d7626e92e..14d419c2013 100644
>>> --- a/gcc/lto/lto-lang.cc
>>> +++ b/gcc/lto/lto-lang.cc
>>> @@ -1050,7 +1050,7 @@ lto_type_for_mode (machine_mode mode, int unsigned_p)
>>>    else if (GET_MODE_CLASS (mode) == MODE_VECTOR_BOOL
>>>            && valid_vector_subparts_p (GET_MODE_NUNITS (mode)))
>>>      {
>>> -      unsigned int elem_bits = vector_element_size (GET_MODE_BITSIZE (mode),
>>> +      unsigned int elem_bits = vectwhereor_element_size (GET_MODE_PRECISION (mode),
>>
>> This seems weird?

Indeed :D Must be an accidental middle-click in Thunderbird.  I just
re-checked and the diff itself is fine.

> FWIW, I bootstrapped & regression-tested the patch with that fixed
> on aarch64-linux-gnu (all languages).
> 
> So OK with the above fixed from my POV.
Oh, thanks!  Mine is still running, not even with all languages.  I picked
the M1 from the compile farm which only has eight cores.

Kito (or somebody else), would you mind doing a RISC-V bootstrap?  It would
take forever on my machine.  Thank you.

Regards
 Robin


* Re: [PATCH V3] RISC-V: Fix bug of pre-calculated const vector mask for VNx1BI, VNx2BI and VNx4BI
  2023-06-29 14:12                         ` Robin Dapp
@ 2023-07-04 19:07                           ` Robin Dapp
  0 siblings, 0 replies; 21+ messages in thread
From: Robin Dapp @ 2023-07-04 19:07 UTC (permalink / raw)
  To: Kito Cheng, Richard Biener, Robin Dapp via Gcc-patches,
	钟居哲,
	Jeff Law, kito.cheng, palmer, palmer, richard.sandiford
  Cc: rdapp.gcc

> Kito (or somebody else), would you mind doing a RISC-V bootstrap?  It would
> take forever on my machine.  Thank you.
I did a bootstrap myself now and it finally finished.  Going to commit the
attached tomorrow.

Regards
 Robin

Subject: [PATCH] Change MODE_BITSIZE to MODE_PRECISION for MODE_VECTOR_BOOL.

RISC-V lowers the TYPE_PRECISION for MODE_VECTOR_BOOL vectors in order
to distinguish between VNx1BI, VNx2BI, VNx4BI and VNx8BI.

This patch adjusts uses of MODE_VECTOR_BOOL to use GET_MODE_PRECISION
instead of GET_MODE_BITSIZE.

The RISC-V tests are provided by Juzhe.

Co-Authored-By: Juzhe-Zhong <juzhe.zhong@rivai.ai>

gcc/c-family/ChangeLog:

	* c-common.cc (c_common_type_for_mode): Use GET_MODE_PRECISION.

gcc/ChangeLog:

	* config/riscv/riscv-v.cc (expand_const_vector): Ditto.
	* simplify-rtx.cc (native_encode_rtx): Ditto.
	(native_decode_vector_rtx): Ditto.
	(simplify_const_vector_byte_offset): Ditto.
	(simplify_const_vector_subreg): Ditto.
	* tree.cc (build_truth_vector_type_for_mode): Ditto.
	* varasm.cc (output_constant_pool_2): Ditto.

gcc/fortran/ChangeLog:

	* trans-types.cc (gfc_type_for_mode): Ditto.

gcc/go/ChangeLog:

	* go-lang.cc (go_langhook_type_for_mode): Ditto.

gcc/lto/ChangeLog:

	* lto-lang.cc (lto_type_for_mode): Ditto.

gcc/rust/ChangeLog:

	* backend/rust-tree.cc (c_common_type_for_mode): Ditto.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-1.c: New test.
	* gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-10.c: New test.
	* gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-11.c: New test.
	* gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-12.c: New test.
	* gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-13.c: New test.
	* gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-14.c: New test.
	* gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-2.c: New test.
	* gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-3.c: New test.
	* gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-4.c: New test.
	* gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-5.c: New test.
	* gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-6.c: New test.
	* gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-7.c: New test.
	* gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-8.c: New test.
	* gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-9.c: New test.
---
 gcc/c-family/c-common.cc                      |  2 +-
 gcc/config/riscv/riscv-v.cc                   | 12 ++++----
 gcc/fortran/trans-types.cc                    |  2 +-
 gcc/go/go-lang.cc                             |  2 +-
 gcc/lto/lto-lang.cc                           |  2 +-
 gcc/rust/backend/rust-tree.cc                 |  2 +-
 gcc/simplify-rtx.cc                           | 10 +++----
 .../riscv/rvv/autovec/vls-vlmax/bitmask-1.c   | 23 ++++++++++++++
 .../riscv/rvv/autovec/vls-vlmax/bitmask-10.c  | 22 ++++++++++++++
 .../riscv/rvv/autovec/vls-vlmax/bitmask-11.c  | 23 ++++++++++++++
 .../riscv/rvv/autovec/vls-vlmax/bitmask-12.c  | 23 ++++++++++++++
 .../riscv/rvv/autovec/vls-vlmax/bitmask-13.c  | 23 ++++++++++++++
 .../riscv/rvv/autovec/vls-vlmax/bitmask-14.c  | 24 +++++++++++++++
 .../riscv/rvv/autovec/vls-vlmax/bitmask-2.c   | 23 ++++++++++++++
 .../riscv/rvv/autovec/vls-vlmax/bitmask-3.c   | 23 ++++++++++++++
 .../riscv/rvv/autovec/vls-vlmax/bitmask-4.c   | 23 ++++++++++++++
 .../riscv/rvv/autovec/vls-vlmax/bitmask-5.c   | 25 ++++++++++++++++
 .../riscv/rvv/autovec/vls-vlmax/bitmask-6.c   | 27 +++++++++++++++++
 .../riscv/rvv/autovec/vls-vlmax/bitmask-7.c   | 30 +++++++++++++++++++
 .../riscv/rvv/autovec/vls-vlmax/bitmask-8.c   | 30 +++++++++++++++++++
 .../riscv/rvv/autovec/vls-vlmax/bitmask-9.c   | 30 +++++++++++++++++++
 gcc/tree.cc                                   |  2 +-
 gcc/varasm.cc                                 |  8 ++++-
 23 files changed, 374 insertions(+), 17 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-10.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-11.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-12.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-13.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-14.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-3.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-4.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-5.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-6.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-7.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-8.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-9.c

diff --git a/gcc/c-family/c-common.cc b/gcc/c-family/c-common.cc
index 34566a342bd..6ab63dae997 100644
--- a/gcc/c-family/c-common.cc
+++ b/gcc/c-family/c-common.cc
@@ -2458,7 +2458,7 @@ c_common_type_for_mode (machine_mode mode, int unsignedp)
   else if (GET_MODE_CLASS (mode) == MODE_VECTOR_BOOL
 	   && valid_vector_subparts_p (GET_MODE_NUNITS (mode)))
     {
-      unsigned int elem_bits = vector_element_size (GET_MODE_BITSIZE (mode),
+      unsigned int elem_bits = vector_element_size (GET_MODE_PRECISION (mode),
 						    GET_MODE_NUNITS (mode));
       tree bool_type = build_nonstandard_boolean_type (elem_bits);
       return build_vector_type_for_mode (bool_type, mode);
diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index 3f9ee044e8e..0595e5726a7 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -1141,11 +1141,13 @@ expand_const_vector (rtx target, rtx src)
   if (GET_MODE_CLASS (mode) == MODE_VECTOR_BOOL)
     {
       rtx elt;
-      gcc_assert (
-	const_vec_duplicate_p (src, &elt)
-	&& (rtx_equal_p (elt, const0_rtx) || rtx_equal_p (elt, const1_rtx)));
-      rtx ops[] = {target, src};
-      emit_vlmax_insn (code_for_pred_mov (mode), RVV_UNOP, ops);
+      if (const_vec_duplicate_p (src, &elt))
+	{
+	  rtx ops[] = {target, src};
+	  emit_vlmax_insn (code_for_pred_mov (mode), RVV_UNOP, ops);
+	}
+      else
+	gcc_unreachable ();
       return;
     }
 
diff --git a/gcc/fortran/trans-types.cc b/gcc/fortran/trans-types.cc
index d718f28cc86..987e3d26c46 100644
--- a/gcc/fortran/trans-types.cc
+++ b/gcc/fortran/trans-types.cc
@@ -3403,7 +3403,7 @@ gfc_type_for_mode (machine_mode mode, int unsignedp)
   else if (GET_MODE_CLASS (mode) == MODE_VECTOR_BOOL
 	   && valid_vector_subparts_p (GET_MODE_NUNITS (mode)))
     {
-      unsigned int elem_bits = vector_element_size (GET_MODE_BITSIZE (mode),
+      unsigned int elem_bits = vector_element_size (GET_MODE_PRECISION (mode),
 						    GET_MODE_NUNITS (mode));
       tree bool_type = build_nonstandard_boolean_type (elem_bits);
       return build_vector_type_for_mode (bool_type, mode);
diff --git a/gcc/go/go-lang.cc b/gcc/go/go-lang.cc
index e85a4bfe949..d5c871a533c 100644
--- a/gcc/go/go-lang.cc
+++ b/gcc/go/go-lang.cc
@@ -414,7 +414,7 @@ go_langhook_type_for_mode (machine_mode mode, int unsignedp)
   if (GET_MODE_CLASS (mode) == MODE_VECTOR_BOOL
       && valid_vector_subparts_p (GET_MODE_NUNITS (mode)))
     {
-      unsigned int elem_bits = vector_element_size (GET_MODE_BITSIZE (mode),
+      unsigned int elem_bits = vector_element_size (GET_MODE_PRECISION (mode),
 						    GET_MODE_NUNITS (mode));
       tree bool_type = build_nonstandard_boolean_type (elem_bits);
       return build_vector_type_for_mode (bool_type, mode);
diff --git a/gcc/lto/lto-lang.cc b/gcc/lto/lto-lang.cc
index 52d7626e92e..14d419c2013 100644
--- a/gcc/lto/lto-lang.cc
+++ b/gcc/lto/lto-lang.cc
@@ -1050,7 +1050,7 @@ lto_type_for_mode (machine_mode mode, int unsigned_p)
   else if (GET_MODE_CLASS (mode) == MODE_VECTOR_BOOL
 	   && valid_vector_subparts_p (GET_MODE_NUNITS (mode)))
     {
-      unsigned int elem_bits = vector_element_size (GET_MODE_BITSIZE (mode),
+      unsigned int elem_bits = vector_element_size (GET_MODE_PRECISION (mode),
 						    GET_MODE_NUNITS (mode));
       tree bool_type = build_nonstandard_boolean_type (elem_bits);
       return build_vector_type_for_mode (bool_type, mode);
diff --git a/gcc/rust/backend/rust-tree.cc b/gcc/rust/backend/rust-tree.cc
index 8243d4cf5c6..66e859cd70c 100644
--- a/gcc/rust/backend/rust-tree.cc
+++ b/gcc/rust/backend/rust-tree.cc
@@ -5320,7 +5320,7 @@ c_common_type_for_mode (machine_mode mode, int unsignedp)
 	   && valid_vector_subparts_p (GET_MODE_NUNITS (mode)))
     {
       unsigned int elem_bits
-	= vector_element_size (GET_MODE_BITSIZE (mode), GET_MODE_NUNITS (mode));
+	= vector_element_size (GET_MODE_PRECISION (mode), GET_MODE_NUNITS (mode));
       tree bool_type = build_nonstandard_boolean_type (elem_bits);
       return build_vector_type_for_mode (bool_type, mode);
     }
diff --git a/gcc/simplify-rtx.cc b/gcc/simplify-rtx.cc
index 99cbdd47d93..d7315d82aa3 100644
--- a/gcc/simplify-rtx.cc
+++ b/gcc/simplify-rtx.cc
@@ -7076,7 +7076,7 @@ native_encode_rtx (machine_mode mode, rtx x, vec<target_unit> &bytes,
       /* CONST_VECTOR_ELT follows target memory order, so no shuffling
 	 is necessary.  The only complication is that MODE_VECTOR_BOOL
 	 vectors can have several elements per byte.  */
-      unsigned int elt_bits = vector_element_size (GET_MODE_BITSIZE (mode),
+      unsigned int elt_bits = vector_element_size (GET_MODE_PRECISION (mode),
 						   GET_MODE_NUNITS (mode));
       unsigned int elt = first_byte * BITS_PER_UNIT / elt_bits;
       if (elt_bits < BITS_PER_UNIT)
@@ -7222,7 +7222,7 @@ native_decode_vector_rtx (machine_mode mode, const vec<target_unit> &bytes,
 {
   rtx_vector_builder builder (mode, npatterns, nelts_per_pattern);
 
-  unsigned int elt_bits = vector_element_size (GET_MODE_BITSIZE (mode),
+  unsigned int elt_bits = vector_element_size (GET_MODE_PRECISION (mode),
 					       GET_MODE_NUNITS (mode));
   if (elt_bits < BITS_PER_UNIT)
     {
@@ -7359,7 +7359,7 @@ simplify_const_vector_byte_offset (rtx x, poly_uint64 byte)
 {
   /* Cope with MODE_VECTOR_BOOL by operating on bits rather than bytes.  */
   machine_mode mode = GET_MODE (x);
-  unsigned int elt_bits = vector_element_size (GET_MODE_BITSIZE (mode),
+  unsigned int elt_bits = vector_element_size (GET_MODE_PRECISION (mode),
 					       GET_MODE_NUNITS (mode));
   /* The number of bits needed to encode one element from each pattern.  */
   unsigned int sequence_bits = CONST_VECTOR_NPATTERNS (x) * elt_bits;
@@ -7414,10 +7414,10 @@ simplify_const_vector_subreg (machine_mode outermode, rtx x,
 
   /* Cope with MODE_VECTOR_BOOL by operating on bits rather than bytes.  */
   unsigned int x_elt_bits
-    = vector_element_size (GET_MODE_BITSIZE (innermode),
+    = vector_element_size (GET_MODE_PRECISION (innermode),
 			   GET_MODE_NUNITS (innermode));
   unsigned int out_elt_bits
-    = vector_element_size (GET_MODE_BITSIZE (outermode),
+    = vector_element_size (GET_MODE_PRECISION (outermode),
 			   GET_MODE_NUNITS (outermode));
 
   /* The number of bits needed to encode one element from every pattern
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-1.c
new file mode 100644
index 00000000000..81229fd62b9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-1.c
@@ -0,0 +1,23 @@
+/* { dg-do run { target { riscv_vector } } } */
+/* { dg-options "--param riscv-autovec-preference=fixed-vlmax -O3" } */
+
+#include <stdint-gcc.h>
+#include <assert.h>
+#define N 16
+
+int
+main ()
+{
+  int mask[N] = {0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1};
+  int64_t out[N] = {0};
+  for (int i = 0; i < N; ++i)
+    if (mask[i])
+      out[i] = i;
+  for (int i = 0; i < N; ++i)
+    {
+      if (mask[i])
+	assert (out[i] == i);
+      else
+	assert (out[i] == 0);
+    }
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-10.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-10.c
new file mode 100644
index 00000000000..d891f3c16e9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-10.c
@@ -0,0 +1,22 @@
+/* { dg-do run { target { riscv_vector } } } */
+/* { dg-options "--param riscv-autovec-preference=fixed-vlmax -O3 --param riscv-autovec-lmul=m2" } */
+#include <stdint-gcc.h>
+#include <assert.h>
+#define N 16
+
+int
+main ()
+{
+  int mask[N] = {0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1};
+  int8_t out[N] = {0};
+  for (int i = 0; i < N; ++i)
+    if (mask[i])
+      out[i] = i;
+  for (int i = 0; i < N; ++i)
+    {
+      if (mask[i])
+	assert (out[i] == i);
+      else
+	assert (out[i] == 0);
+    }
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-11.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-11.c
new file mode 100644
index 00000000000..535641443ec
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-11.c
@@ -0,0 +1,23 @@
+/* { dg-do run { target { riscv_vector } } } */
+/* { dg-options "--param riscv-autovec-preference=fixed-vlmax -O3" } */
+
+#include <stdint-gcc.h>
+#include <assert.h>
+#define N 4
+
+int
+main ()
+{
+  int mask[N] = {0, 1, 0, 1};
+  int out[N] = {0};
+  for (int i = 0; i < N; ++i)
+    if (mask[i])
+      out[i] = i;
+  for (int i = 0; i < N; ++i)
+    {
+      if (mask[i])
+	assert (out[i] == i);
+      else
+	assert (out[i] == 0);
+    }
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-12.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-12.c
new file mode 100644
index 00000000000..a7c12c3797b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-12.c
@@ -0,0 +1,23 @@
+/* { dg-do run { target { riscv_vector } } } */
+/* { dg-options "--param riscv-autovec-preference=fixed-vlmax -O3 --param riscv-autovec-lmul=m2" } */
+
+#include <stdint-gcc.h>
+#include <assert.h>
+#define N 8
+
+int
+main ()
+{
+  int mask[N] = {0, 1, 0, 1, 0, 1, 0, 1};
+  int out[N] = {0};
+  for (int i = 0; i < N; ++i)
+    if (mask[i])
+      out[i] = i;
+  for (int i = 0; i < N; ++i)
+    {
+      if (mask[i])
+	assert (out[i] == i);
+      else
+	assert (out[i] == 0);
+    }
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-13.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-13.c
new file mode 100644
index 00000000000..726238c1cd8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-13.c
@@ -0,0 +1,23 @@
+/* { dg-do run { target { riscv_vector } } } */
+/* { dg-options "--param riscv-autovec-preference=fixed-vlmax -O3 --param riscv-autovec-lmul=m4" } */
+
+#include <stdint-gcc.h>
+#include <assert.h>
+#define N 16
+
+int
+main ()
+{
+  int mask[N] = {0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1};
+  int out[N] = {0};
+  for (int i = 0; i < N; ++i)
+    if (mask[i])
+      out[i] = i;
+  for (int i = 0; i < N; ++i)
+    {
+      if (mask[i])
+	assert (out[i] == i);
+      else
+	assert (out[i] == 0);
+    }
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-14.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-14.c
new file mode 100644
index 00000000000..c369cf0b268
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-14.c
@@ -0,0 +1,24 @@
+/* { dg-do run { target { riscv_vector } } } */
+/* { dg-options "--param riscv-autovec-preference=fixed-vlmax -O3 --param riscv-autovec-lmul=m8" } */
+
+#include <stdint-gcc.h>
+#include <assert.h>
+#define N 32
+
+int
+main ()
+{
+  int mask[N] = {0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1,
+		 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1};
+  int out[N] = {0};
+  for (int i = 0; i < N; ++i)
+    if (mask[i])
+      out[i] = i;
+  for (int i = 0; i < N; ++i)
+    {
+      if (mask[i])
+	assert (out[i] == i);
+      else
+	assert (out[i] == 0);
+    }
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-2.c
new file mode 100644
index 00000000000..a23e47171bc
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-2.c
@@ -0,0 +1,23 @@
+/* { dg-do run { target { riscv_vector } } } */
+/* { dg-options "--param riscv-autovec-preference=fixed-vlmax -O3" } */
+
+#include <stdint-gcc.h>
+#include <assert.h>
+#define N 16
+
+int
+main ()
+{
+  int mask[N] = {0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1};
+  int out[N] = {0};
+  for (int i = 0; i < N; ++i)
+    if (mask[i])
+      out[i] = i;
+  for (int i = 0; i < N; ++i)
+    {
+      if (mask[i])
+	assert (out[i] == i);
+      else
+	assert (out[i] == 0);
+    }
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-3.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-3.c
new file mode 100644
index 00000000000..6ea8fdd89c0
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-3.c
@@ -0,0 +1,23 @@
+/* { dg-do run { target { riscv_vector } } } */
+/* { dg-options "--param riscv-autovec-preference=fixed-vlmax -O3" } */
+
+#include <stdint-gcc.h>
+#include <assert.h>
+#define N 16
+
+int
+main ()
+{
+  int16_t mask[N] = {0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1};
+  int16_t out[N] = {0};
+  for (int16_t i = 0; i < N; ++i)
+    if (mask[i])
+      out[i] = i;
+  for (int16_t i = 0; i < N; ++i)
+    {
+      if (mask[i])
+	assert (out[i] == i);
+      else
+	assert (out[i] == 0);
+    }
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-4.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-4.c
new file mode 100644
index 00000000000..2d97c26abfd
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-4.c
@@ -0,0 +1,23 @@
+/* { dg-do run { target { riscv_vector } } } */
+/* { dg-options "--param riscv-autovec-preference=fixed-vlmax -O3" } */
+
+#include <stdint-gcc.h>
+#include <assert.h>
+#define N 16
+
+int
+main ()
+{
+  int8_t mask[N] = {0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1};
+  int8_t out[N] = {0};
+  for (int8_t i = 0; i < N; ++i)
+    if (mask[i])
+      out[i] = i;
+  for (int8_t i = 0; i < N; ++i)
+    {
+      if (mask[i])
+	assert (out[i] == i);
+      else
+	assert (out[i] == 0);
+    }
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-5.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-5.c
new file mode 100644
index 00000000000..b89b70e99a6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-5.c
@@ -0,0 +1,25 @@
+/* { dg-do run { target { riscv_vector } } } */
+/* { dg-options "--param riscv-autovec-preference=fixed-vlmax --param riscv-autovec-lmul=m2 -O3" } */
+
+#include <stdint-gcc.h>
+#include <assert.h>
+
+#define N 32
+
+int
+main ()
+{
+  int8_t mask[N] = {0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1,
+		    0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1};
+  int8_t out[N] = {0};
+  for (int8_t i = 0; i < N; ++i)
+    if (mask[i])
+      out[i] = i;
+  for (int8_t i = 0; i < N; ++i)
+    {
+      if (mask[i])
+	assert (out[i] == i);
+      else
+	assert (out[i] == 0);
+    }
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-6.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-6.c
new file mode 100644
index 00000000000..ac8d91e793b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-6.c
@@ -0,0 +1,27 @@
+/* { dg-do run { target { riscv_vector } } } */
+/* { dg-options "--param riscv-autovec-preference=fixed-vlmax --param riscv-autovec-lmul=m4 -O3" } */
+
+#include <stdint-gcc.h>
+#include <assert.h>
+
+#define N 64
+
+int
+main ()
+{
+  int8_t mask[N] = {0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1,
+		    0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1,
+		    0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1,
+		    0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1};
+  int8_t out[N] = {0};
+  for (int8_t i = 0; i < N; ++i)
+    if (mask[i])
+      out[i] = i;
+  for (int8_t i = 0; i < N; ++i)
+    {
+      if (mask[i])
+	assert (out[i] == i);
+      else
+	assert (out[i] == 0);
+    }
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-7.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-7.c
new file mode 100644
index 00000000000..f538db23b1d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-7.c
@@ -0,0 +1,30 @@
+/* { dg-do run { target { riscv_vector } } } */
+/* { dg-options "--param riscv-autovec-preference=fixed-vlmax --param riscv-autovec-lmul=m8 -O3" } */
+
+#include <stdint-gcc.h>
+#include <assert.h>
+
+#define N 128
+
+int
+main ()
+{
+  uint8_t mask[N]
+    = {0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1,
+       0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1,
+       0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1,
+       0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1,
+       0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1,
+       0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1};
+  uint8_t out[N] = {0};
+  for (uint8_t i = 0; i < N; ++i)
+    if (mask[i])
+      out[i] = i;
+  for (uint8_t i = 0; i < N; ++i)
+    {
+      if (mask[i])
+	assert (out[i] == i);
+      else
+	assert (out[i] == 0);
+    }
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-8.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-8.c
new file mode 100644
index 00000000000..5abb34c1686
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-8.c
@@ -0,0 +1,30 @@
+/* { dg-do run { target { riscv_vector } } } */
+/* { dg-options "--param riscv-autovec-preference=fixed-vlmax --param riscv-autovec-lmul=m8 -O3" } */
+
+#include <stdint-gcc.h>
+#include <assert.h>
+
+#define N 128
+
+int
+main ()
+{
+  int mask[N]
+    = {0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1,
+       0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1,
+       0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1,
+       0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1,
+       0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1,
+       0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1};
+  int out[N] = {0};
+  for (int i = 0; i < N; ++i)
+    if (mask[i])
+      out[i] = i;
+  for (int i = 0; i < N; ++i)
+    {
+      if (mask[i])
+	assert (out[i] == i);
+      else
+	assert (out[i] == 0);
+    }
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-9.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-9.c
new file mode 100644
index 00000000000..6fdaa516534
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-9.c
@@ -0,0 +1,30 @@
+/* { dg-do run { target { riscv_vector } } } */
+/* { dg-options "--param riscv-autovec-preference=fixed-vlmax --param riscv-autovec-lmul=m8 -O3" } */
+
+#include <stdint-gcc.h>
+#include <assert.h>
+
+#define N 128
+
+int
+main ()
+{
+  int64_t mask[N]
+    = {0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1,
+       0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1,
+       0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1,
+       0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1,
+       0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1,
+       0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1};
+  int64_t out[N] = {0};
+  for (int i = 0; i < N; ++i)
+    if (mask[i])
+      out[i] = i;
+  for (int i = 0; i < N; ++i)
+    {
+      if (mask[i])
+	assert (out[i] == i);
+      else
+	assert (out[i] == 0);
+    }
+}
diff --git a/gcc/tree.cc b/gcc/tree.cc
index bd500ec72a5..420857b110c 100644
--- a/gcc/tree.cc
+++ b/gcc/tree.cc
@@ -10143,7 +10143,7 @@ build_truth_vector_type_for_mode (poly_uint64 nunits, machine_mode mask_mode)
   unsigned HOST_WIDE_INT esize;
   if (VECTOR_MODE_P (mask_mode))
     {
-      poly_uint64 vsize = GET_MODE_BITSIZE (mask_mode);
+      poly_uint64 vsize = GET_MODE_PRECISION (mask_mode);
       esize = vector_element_size (vsize, nunits);
     }
   else
diff --git a/gcc/varasm.cc b/gcc/varasm.cc
index 8ae0a2555cd..53f0cc61922 100644
--- a/gcc/varasm.cc
+++ b/gcc/varasm.cc
@@ -4061,11 +4061,17 @@ output_constant_pool_2 (fixed_size_mode mode, rtx x, unsigned int align)
 	   whole element.  Often this is byte_mode and contains more
 	   than one element.  */
 	unsigned int nelts = GET_MODE_NUNITS (mode);
-	unsigned int elt_bits = GET_MODE_BITSIZE (mode) / nelts;
+	unsigned int elt_bits = GET_MODE_PRECISION (mode) / nelts;
 	unsigned int int_bits = MAX (elt_bits, BITS_PER_UNIT);
 	scalar_int_mode int_mode = int_mode_for_size (int_bits, 0).require ();
 	unsigned int mask = GET_MODE_MASK (GET_MODE_INNER (mode));
 
+	/* We allow GET_MODE_PRECISION (mode) <= GET_MODE_BITSIZE (mode) but
+	   only properly handle cases where the difference is less than a
+	   byte.  */
+	gcc_assert (GET_MODE_BITSIZE (mode) - GET_MODE_PRECISION (mode) <
+		    BITS_PER_UNIT);
+
 	/* Build the constant up one integer at a time.  */
 	unsigned int elts_per_int = int_bits / elt_bits;
 	for (unsigned int i = 0; i < nelts; i += elts_per_int)
-- 
2.41.0
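
For readers following the thread, here is a minimal sketch of why the
per-element bit width matters when a boolean mask constant is written
to the constant pool.  It is not the GCC code: pack_first_byte and
elt_bits_for are hypothetical helpers, and the mode parameters (four
elements, BITSIZE 8, PRECISION 4) are assumptions chosen so that the
results match the .LC1 bytes quoted in the original report.

```c
#include <stdint.h>

#define BITS_PER_UNIT 8 /* Assumed: 8-bit bytes.  */

/* Hypothetical helper: bits given to each mask element when the mode
   is TOTAL_BITS wide and holds NELTS elements.  */
unsigned int
elt_bits_for (unsigned int total_bits, unsigned int nelts)
{
  return total_bits / nelts;
}

/* Hypothetical helper: build the first byte of a mask constant.
   Element I occupies ELT_BITS bits starting at bit I * ELT_BITS;
   only its low bit carries the boolean value.  */
uint8_t
pack_first_byte (const int *elts, unsigned int nelts, unsigned int elt_bits)
{
  uint8_t byte = 0;
  for (unsigned int i = 0;
       i < nelts && (i + 1) * elt_bits <= BITS_PER_UNIT; ++i)
    byte |= (uint8_t) ((elts[i] & 1u) << (i * elt_bits));
  return byte;
}

/* Hypothetical check mirroring the new assert in varasm.cc: the gap
   between BITSIZE and PRECISION must be smaller than one byte.  */
int
precision_gap_ok (unsigned int bitsize, unsigned int precision)
{
  return bitsize - precision < BITS_PER_UNIT;
}
```

With elts = {0, 1, 0, 1}, dividing the assumed BITSIZE of 8 by 4 gives
2 bits per element and the byte 68 (0b01000100); dividing the assumed
PRECISION of 4 by 4 gives 1 bit per element and the byte 10 (0b1010),
which is the compact layout RVV expects.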




end of thread, other threads:[~2023-07-04 19:07 UTC | newest]

Thread overview: 21+ messages
2023-06-28  9:47 [PATCH V3] RISC-V: Fix bug of pre-calculated const vector mask for VNx1BI, VNx2BI and VNx4BI Juzhe-Zhong
2023-06-28 18:11 ` Jeff Law
2023-06-28 19:02   ` 钟居哲
2023-06-28 19:12     ` Robin Dapp
2023-06-28 20:42       ` Richard Sandiford
2023-06-28 21:46         ` 钟居哲
2023-06-29  7:53         ` Richard Sandiford
2023-06-29  8:08           ` juzhe.zhong
2023-06-29  8:14           ` Robin Dapp
2023-06-29  8:18             ` juzhe.zhong
2023-06-29  8:53               ` Robin Dapp
2023-06-29  9:01                 ` juzhe.zhong
2023-06-29  8:54             ` Richard Sandiford
2023-06-29  9:09               ` Robin Dapp
2023-06-29  9:23                 ` juzhe.zhong
2023-06-29 11:22                 ` Richard Biener
2023-06-29 11:38                   ` Robin Dapp
2023-06-29 13:53                     ` Kito Cheng
2023-06-29 14:04                       ` Richard Sandiford
2023-06-29 14:12                         ` Robin Dapp
2023-07-04 19:07                           ` Robin Dapp
