public inbox for gcc-patches@gcc.gnu.org
* [PATCH 00/13] ARM/MVE use vectors of boolean for predicates
@ 2021-09-07  9:15 Christophe Lyon
  2021-09-07  9:15 ` [PATCH 01/13] arm: Add new tests for comparison vectorization with Neon and MVE Christophe Lyon
                   ` (4 more replies)
  0 siblings, 5 replies; 20+ messages in thread
From: Christophe Lyon @ 2021-09-07  9:15 UTC
  To: gcc-patches

This patch series addresses PRs 100757 and 101325 by representing
vectors of predicates (MVE VPR.P0 register) as vectors of booleans
rather than using HImode.
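
As a rough illustration of the two representations (a sketch only, not
code from this series): VPR.P0 holds 16 predicate bits, one per byte of
a 128-bit vector, so a 32-bit lane owns 4 consecutive bits.  The helper
names below are hypothetical, and treating a lane as true only when all
of its bits are set is a simplifying assumption.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not part of GCC: expand a 16-bit predicate
   (one bit per byte of a 128-bit vector) into one boolean per lane.
   lane_bytes is 1, 2 or 4.  Assumption: a lane counts as true only
   when all of its bits are set.  */
static void p0_to_lanes(uint16_t p0, unsigned lane_bytes, bool *lanes)
{
  unsigned nlanes = 16 / lane_bytes;
  uint16_t lane_mask = (uint16_t)((1u << lane_bytes) - 1u);

  for (unsigned i = 0; i < nlanes; i++)
    lanes[i] = ((p0 >> (i * lane_bytes)) & lane_mask) == lane_mask;
}

/* Hypothetical inverse: pack one boolean per lane back into the
   16-bit integer view of the predicate.  */
static uint16_t lanes_to_p0(const bool *lanes, unsigned lane_bytes)
{
  unsigned nlanes = 16 / lane_bytes;
  uint16_t lane_mask = (uint16_t)((1u << lane_bytes) - 1u);
  uint16_t p0 = 0;

  for (unsigned i = 0; i < nlanes; i++)
    if (lanes[i])
      p0 |= (uint16_t)(lane_mask << (i * lane_bytes));
  return p0;
}
```

With 32-bit lanes, a predicate value of 0x0F0F corresponds to the four
booleans { true, false, true, false }.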

As this implies a lot of mostly mechanical changes, I have tried to
split the patches in a way that should help reviewers, but the split
is a bit artificial.

Patches 1-3 add new tests.

Patches 4-6 are small independent improvements.

Patch 7 implements the predicate qualifier, but does not change any
builtin yet.

Patch 8 is the first of the two main patches, and uses the new
qualifier to describe the vcmp and vpsel builtins that are useful for
auto-vectorization of comparisons.

Patch 9 is the second main patch, which fixes the vcond_mask expander.

Patches 10-13 convert almost all the remaining builtins with HI
operands to use the predicate qualifier.  After these, there are still
a few builtins with HI operands left, about which I am not sure: vctp,
vpnot, load-gather and store-scatter with v2di operands.  In fact,
patches 11/12 update some STR/LDR qualifiers in a way that breaks
these v2di builtins although existing tests still pass.

Christophe Lyon (13):
  arm: Add new tests for comparison vectorization with Neon and MVE
  arm: Add tests for PR target/100757
  arm: Add test for PR target/101325
  arm: Add GENERAL_AND_VPR_REGS regclass
  arm: Add support for VPR_REG in arm_class_likely_spilled_p
  arm: Fix mve_vmvnq_n_<supf><mode> argument mode
  arm: Implement MVE predicates as vectors of booleans
  arm: Implement auto-vectorized MVE comparisons with vectors of boolean
    predicates
  arm: Fix vcond_mask expander for MVE (PR target/100757)
  arm: Convert remaining MVE vcmp builtins to predicate qualifiers
  arm: Convert more MVE builtins to predicate qualifiers
  arm: Convert more load/store MVE builtins to predicate qualifiers
  arm: Convert more MVE/CDE builtins to predicate qualifiers

 gcc/config/arm/arm-builtins.c                 | 228 +++--
 gcc/config/arm/arm-modes.def                  |   5 +
 gcc/config/arm/arm-protos.h                   |   3 +-
 gcc/config/arm/arm-simd-builtin-types.def     |   4 +
 gcc/config/arm/arm.c                          | 128 ++-
 gcc/config/arm/arm.h                          |   5 +-
 gcc/config/arm/arm_mve_builtins.def           | 746 ++++++++--------
 gcc/config/arm/iterators.md                   |   5 +
 gcc/config/arm/mve.md                         | 823 ++++++++++--------
 gcc/config/arm/neon.md                        |  39 +
 gcc/config/arm/vec-common.md                  |  52 --
 gcc/simplify-rtx.c                            |   7 +
 .../arm/acle/cde-mve-full-assembly.c          | 264 +++---
 .../gcc.target/arm/simd/mve-vcmp-f32-2.c      |  32 +
 .../gcc.target/arm/simd/neon-compare-1.c      |  78 ++
 .../gcc.target/arm/simd/neon-compare-2.c      |  13 +
 .../gcc.target/arm/simd/neon-compare-3.c      |  14 +
 .../arm/simd/neon-compare-scalar-1.c          |  57 ++
 .../gcc.target/arm/simd/neon-vcmp-f16.c       |  12 +
 .../gcc.target/arm/simd/neon-vcmp-f32-2.c     |  15 +
 .../gcc.target/arm/simd/neon-vcmp-f32-3.c     |  12 +
 .../gcc.target/arm/simd/neon-vcmp-f32.c       |  12 +
 gcc/testsuite/gcc.target/arm/simd/neon-vcmp.c |  22 +
 .../gcc.target/arm/simd/pr100757-2.c          |  20 +
 .../gcc.target/arm/simd/pr100757-3.c          |  20 +
 .../gcc.target/arm/simd/pr100757-4.c          |  19 +
 gcc/testsuite/gcc.target/arm/simd/pr100757.c  |  19 +
 gcc/testsuite/gcc.target/arm/simd/pr101325.c  |  14 +
 28 files changed, 1581 insertions(+), 1087 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/arm/simd/mve-vcmp-f32-2.c
 create mode 100644 gcc/testsuite/gcc.target/arm/simd/neon-compare-1.c
 create mode 100644 gcc/testsuite/gcc.target/arm/simd/neon-compare-2.c
 create mode 100644 gcc/testsuite/gcc.target/arm/simd/neon-compare-3.c
 create mode 100644 gcc/testsuite/gcc.target/arm/simd/neon-compare-scalar-1.c
 create mode 100644 gcc/testsuite/gcc.target/arm/simd/neon-vcmp-f16.c
 create mode 100644 gcc/testsuite/gcc.target/arm/simd/neon-vcmp-f32-2.c
 create mode 100644 gcc/testsuite/gcc.target/arm/simd/neon-vcmp-f32-3.c
 create mode 100644 gcc/testsuite/gcc.target/arm/simd/neon-vcmp-f32.c
 create mode 100644 gcc/testsuite/gcc.target/arm/simd/neon-vcmp.c
 create mode 100644 gcc/testsuite/gcc.target/arm/simd/pr100757-2.c
 create mode 100644 gcc/testsuite/gcc.target/arm/simd/pr100757-3.c
 create mode 100644 gcc/testsuite/gcc.target/arm/simd/pr100757-4.c
 create mode 100644 gcc/testsuite/gcc.target/arm/simd/pr100757.c
 create mode 100644 gcc/testsuite/gcc.target/arm/simd/pr101325.c

-- 
2.25.1



* [PATCH 01/13] arm: Add new tests for comparison vectorization with Neon and MVE
  2021-09-07  9:15 [PATCH 00/13] ARM/MVE use vectors of boolean for predicates Christophe Lyon
@ 2021-09-07  9:15 ` Christophe Lyon
  2021-09-28 11:11   ` Kyrylo Tkachov
  2021-09-07  9:15 ` [PATCH 02/13] arm: Add tests for PR target/100757 Christophe Lyon
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 20+ messages in thread
From: Christophe Lyon @ 2021-09-07  9:15 UTC
  To: gcc-patches

This patch mainly adds Neon tests similar to existing MVE ones,
to make sure we do not break Neon when fixing MVE.

mve-vcmp-f32-2.c is similar to mve-vcmp-f32.c but uses a conditional
with 2.0f and 3.0f constants to help scan-assembler-times.
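
The decimal values matched by the '.word' directives in that test are
the raw bit patterns of the two constants.  A minimal sketch to check
them, assuming 'float' is IEEE 754 binary32 (float_bits is a
hypothetical helper name, not something from the testsuite):

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: return the bit pattern of a float as a 32-bit
   unsigned integer.  Assumes float is IEEE 754 binary32.  memcpy is
   used to avoid aliasing problems.  */
static uint32_t float_bits(float f)
{
  uint32_t u;

  memcpy(&u, &f, sizeof u);
  return u;
}
```

float_bits (2.0f) is 0x40000000 (1073741824) and float_bits (3.0f) is
0x40400000 (1077936128).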

2021-09-01  Christophe Lyon  <christophe.lyon@foss.st.com>

	gcc/testsuite/
	* gcc.target/arm/simd/mve-vcmp-f32-2.c: New.
	* gcc.target/arm/simd/neon-compare-1.c: New.
	* gcc.target/arm/simd/neon-compare-2.c: New.
	* gcc.target/arm/simd/neon-compare-3.c: New.
	* gcc.target/arm/simd/neon-compare-scalar-1.c: New.
	* gcc.target/arm/simd/neon-vcmp-f16.c: New.
	* gcc.target/arm/simd/neon-vcmp-f32-2.c: New.
	* gcc.target/arm/simd/neon-vcmp-f32-3.c: New.
	* gcc.target/arm/simd/neon-vcmp-f32.c: New.
	* gcc.target/arm/simd/neon-vcmp.c: New.

diff --git a/gcc/testsuite/gcc.target/arm/simd/mve-vcmp-f32-2.c b/gcc/testsuite/gcc.target/arm/simd/mve-vcmp-f32-2.c
new file mode 100644
index 00000000000..917a95bf141
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/simd/mve-vcmp-f32-2.c
@@ -0,0 +1,32 @@
+/* { dg-do assemble } */
+/* { dg-require-effective-target arm_v8_1m_mve_fp_ok } */
+/* { dg-add-options arm_v8_1m_mve_fp } */
+/* { dg-additional-options "-O3 -funsafe-math-optimizations" } */
+
+#include <stdint.h>
+
+#define NB 4
+
+#define FUNC(OP, NAME)							\
+  void test_ ## NAME ##_f (float * __restrict__ dest, float *a, float *b) { \
+    int i;								\
+    for (i=0; i<NB; i++) {						\
+      dest[i] = (a[i] OP b[i]) ? 2.0f : 3.0f;				\
+    }									\
+  }
+
+FUNC(==, vcmpeq)
+FUNC(!=, vcmpne)
+FUNC(<, vcmplt)
+FUNC(<=, vcmple)
+FUNC(>, vcmpgt)
+FUNC(>=, vcmpge)
+
+/* { dg-final { scan-assembler-times {\tvcmp.f32\teq, q[0-9]+, q[0-9]+\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tvcmp.f32\tne, q[0-9]+, q[0-9]+\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tvcmp.f32\tlt, q[0-9]+, q[0-9]+\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tvcmp.f32\tle, q[0-9]+, q[0-9]+\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tvcmp.f32\tgt, q[0-9]+, q[0-9]+\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tvcmp.f32\tge, q[0-9]+, q[0-9]+\n} 1 } } */
+/* { dg-final { scan-assembler-times {\t.word\t1073741824\n} 24 } } */ /* Constant 2.0f.  */
+/* { dg-final { scan-assembler-times {\t.word\t1077936128\n} 24 } } */ /* Constant 3.0f.  */
diff --git a/gcc/testsuite/gcc.target/arm/simd/neon-compare-1.c b/gcc/testsuite/gcc.target/arm/simd/neon-compare-1.c
new file mode 100644
index 00000000000..2e0222a71f2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/simd/neon-compare-1.c
@@ -0,0 +1,78 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-add-options arm_neon } */
+/* { dg-additional-options "-O3" } */
+
+#include "mve-compare-1.c"
+
+/* 64-bit vectors.  */
+/* vmvn is used by 'ne' comparisons: 3 sizes * 2 (signed/unsigned) * 2
+   (register/zero) = 12.  */
+/* { dg-final { scan-assembler-times {\tvmvn\td[0-9]+, d[0-9]+\n} 12 } } */
+
+/* { 8 bits } x { eq, ne, lt, le, gt, ge }. */
+/* ne uses eq, lt/le only apply to comparison with zero, they use gt/ge
+   otherwise.  */
+/* { dg-final { scan-assembler-times {\tvceq.i8\td[0-9]+, d[0-9]+, d[0-9]+\n} 4 } } */
+/* { dg-final { scan-assembler-times {\tvceq.i8\td[0-9]+, d[0-9]+, #0\n} 4 } } */
+/* { dg-final { scan-assembler-times {\tvclt.s8\td[0-9]+, d[0-9]+, #0\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tvcle.s8\td[0-9]+, d[0-9]+, #0\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tvcgt.s8\td[0-9]+, d[0-9]+, d[0-9]+\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tvcgt.s8\td[0-9]+, d[0-9]+, #0\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tvcge.s8\td[0-9]+, d[0-9]+, d[0-9]+\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tvcge.s8\td[0-9]+, d[0-9]+, #0\n} 1 } } */
+
+/* { 16 bits } x { eq, ne, lt, le, gt, ge }. */
+/* { dg-final { scan-assembler-times {\tvceq.i16\td[0-9]+, d[0-9]+, d[0-9]+\n} 4 } } */
+/* { dg-final { scan-assembler-times {\tvceq.i16\td[0-9]+, d[0-9]+, #0\n} 4 } } */
+/* { dg-final { scan-assembler-times {\tvclt.s16\td[0-9]+, d[0-9]+, #0\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tvcle.s16\td[0-9]+, d[0-9]+, #0\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tvcgt.s16\td[0-9]+, d[0-9]+, d[0-9]+\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tvcgt.s16\td[0-9]+, d[0-9]+, #0\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tvcge.s16\td[0-9]+, d[0-9]+, d[0-9]+\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tvcge.s16\td[0-9]+, d[0-9]+, #0\n} 1 } } */
+
+/* { 32 bits } x { eq, ne, lt, le, gt, ge }. */
+/* { dg-final { scan-assembler-times {\tvceq.i32\td[0-9]+, d[0-9]+, d[0-9]+\n} 4 } } */
+/* { dg-final { scan-assembler-times {\tvceq.i32\td[0-9]+, d[0-9]+, #0\n} 4 } } */
+/* { dg-final { scan-assembler-times {\tvclt.s32\td[0-9]+, d[0-9]+, #0\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tvcle.s32\td[0-9]+, d[0-9]+, #0\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tvcgt.s32\td[0-9]+, d[0-9]+, d[0-9]+\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tvcgt.s32\td[0-9]+, d[0-9]+, #0\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tvcge.s32\td[0-9]+, d[0-9]+, d[0-9]+\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tvcge.s32\td[0-9]+, d[0-9]+, #0\n} 1 } } */
+
+/* 128-bit vectors.  */
+
+/* vmvn is used by 'ne' comparisons.  */
+/* { dg-final { scan-assembler-times {\tvmvn\tq[0-9]+, q[0-9]+\n} 12 } } */
+
+/* { 8 bits } x { eq, ne, lt, le, gt, ge }.  */
+/* { dg-final { scan-assembler-times {\tvceq.i8\tq[0-9]+, q[0-9]+, q[0-9]+\n} 4 } } */
+/* { dg-final { scan-assembler-times {\tvceq.i8\tq[0-9]+, q[0-9]+, #0\n} 4 } } */
+/* { dg-final { scan-assembler-times {\tvclt.s8\tq[0-9]+, q[0-9]+, #0\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tvcle.s8\tq[0-9]+, q[0-9]+, #0\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tvcgt.s8\tq[0-9]+, q[0-9]+, q[0-9]+\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tvcgt.s8\tq[0-9]+, q[0-9]+, #0\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tvcge.s8\tq[0-9]+, q[0-9]+, q[0-9]+\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tvcge.s8\tq[0-9]+, q[0-9]+, #0\n} 1 } } */
+
+/* { 16 bits } x { eq, ne, lt, le, gt, ge }.  */
+/* { dg-final { scan-assembler-times {\tvceq.i16\tq[0-9]+, q[0-9]+, q[0-9]+\n} 4 } } */
+/* { dg-final { scan-assembler-times {\tvceq.i16\tq[0-9]+, q[0-9]+, #0\n} 4 } } */
+/* { dg-final { scan-assembler-times {\tvclt.s16\tq[0-9]+, q[0-9]+, #0\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tvcle.s16\tq[0-9]+, q[0-9]+, #0\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tvcgt.s16\tq[0-9]+, q[0-9]+, q[0-9]+\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tvcgt.s16\tq[0-9]+, q[0-9]+, #0\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tvcge.s16\tq[0-9]+, q[0-9]+, q[0-9]+\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tvcge.s16\tq[0-9]+, q[0-9]+, #0\n} 1 } } */
+
+/* { 32 bits } x { eq, ne, lt, le, gt, ge }.  */
+/* { dg-final { scan-assembler-times {\tvceq.i32\tq[0-9]+, q[0-9]+, q[0-9]+\n} 4 } } */
+/* { dg-final { scan-assembler-times {\tvceq.i32\tq[0-9]+, q[0-9]+, #0\n} 4 } } */
+/* { dg-final { scan-assembler-times {\tvclt.s32\tq[0-9]+, q[0-9]+, #0\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tvcle.s32\tq[0-9]+, q[0-9]+, #0\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tvcgt.s32\tq[0-9]+, q[0-9]+, q[0-9]+\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tvcgt.s32\tq[0-9]+, q[0-9]+, #0\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tvcge.s32\tq[0-9]+, q[0-9]+, q[0-9]+\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tvcge.s32\tq[0-9]+, q[0-9]+, #0\n} 1 } } */
diff --git a/gcc/testsuite/gcc.target/arm/simd/neon-compare-2.c b/gcc/testsuite/gcc.target/arm/simd/neon-compare-2.c
new file mode 100644
index 00000000000..06f3c14c91e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/simd/neon-compare-2.c
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-add-options arm_neon } */
+/* { dg-additional-options "-O3 -funsafe-math-optimizations" } */
+
+#include "mve-compare-2.c"
+
+/* eq, ne, lt, le, gt, ge.  */
+/* ne uses eq+vmvn, lt/le use gt/ge with swapped operands.  */
+/* { dg-final { scan-assembler-times {\tvceq.f32\tq[0-9]+, q[0-9]+, q[0-9]+\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tvmvn\tq[0-9]+, q[0-9]+\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tvcgt.f32\tq[0-9]+, q[0-9]+, q[0-9]+\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tvcge.f32\tq[0-9]+, q[0-9]+, q[0-9]+\n} 2 } } */
diff --git a/gcc/testsuite/gcc.target/arm/simd/neon-compare-3.c b/gcc/testsuite/gcc.target/arm/simd/neon-compare-3.c
new file mode 100644
index 00000000000..9c9f108843b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/simd/neon-compare-3.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_v8_2a_fp16_neon_ok } */
+/* { dg-add-options arm_v8_2a_fp16_neon } */
+/* { dg-additional-options "-O3 -funsafe-math-optimizations" } */
+
+#include "mve-compare-3.c"
+
+
+/* eq, ne, lt, le, gt, ge.  */
+/* ne uses eq+vmvn, lt/le use gt/ge with swapped operands.  */
+/* { dg-final { scan-assembler-times {\tvceq.f16\tq[0-9]+, q[0-9]+, q[0-9]+\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tvmvn\tq[0-9]+, q[0-9]+\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tvcgt.f16\tq[0-9]+, q[0-9]+, q[0-9]+\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tvcge.f16\tq[0-9]+, q[0-9]+, q[0-9]+\n} 2 } } */
diff --git a/gcc/testsuite/gcc.target/arm/simd/neon-compare-scalar-1.c b/gcc/testsuite/gcc.target/arm/simd/neon-compare-scalar-1.c
new file mode 100644
index 00000000000..0783624a3f2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/simd/neon-compare-scalar-1.c
@@ -0,0 +1,57 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-add-options arm_neon } */
+/* { dg-additional-options "-O3" } */
+
+#include "mve-compare-scalar-1.c"
+
+/* 64-bit vectors.  */
+/* vmvn is used by 'ne' comparisons.  */
+/* { dg-final { scan-assembler-times {\tvmvn\td[0-9]+, d[0-9]+\n} 6 } } */
+
+/* { 8 bits } x { eq, ne, lt, le, gt, ge }.  */
+/* { dg-final { scan-assembler-times {\tvceq.i8\td[0-9]+, d[0-9]+, d[0-9]+\n} 4 } } */
+/* { dg-final { scan-assembler-times {\tvcgt.s8\td[0-9]+, d[0-9]+, d[0-9]+\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tvcgt.u8\td[0-9]+, d[0-9]+, d[0-9]+\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tvcge.s8\td[0-9]+, d[0-9]+, d[0-9]+\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tvcge.u8\td[0-9]+, d[0-9]+, d[0-9]+\n} 2 } } */
+
+/* { 16 bits } x { eq, ne, lt, le, gt, ge }.  */
+/* { dg-final { scan-assembler-times {\tvceq.i16\td[0-9]+, d[0-9]+, d[0-9]+\n} 4 } } */
+/* { dg-final { scan-assembler-times {\tvcgt.s16\td[0-9]+, d[0-9]+, d[0-9]+\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tvcgt.u16\td[0-9]+, d[0-9]+, d[0-9]+\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tvcge.s16\td[0-9]+, d[0-9]+, d[0-9]+\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tvcge.u16\td[0-9]+, d[0-9]+, d[0-9]+\n} 2 } } */
+
+/* { 32 bits } x { eq, ne, lt, le, gt, ge }.  */
+/* { dg-final { scan-assembler-times {\tvceq.i32\td[0-9]+, d[0-9]+, d[0-9]+\n} 4 } } */
+/* { dg-final { scan-assembler-times {\tvcgt.s32\td[0-9]+, d[0-9]+, d[0-9]+\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tvcgt.u32\td[0-9]+, d[0-9]+, d[0-9]+\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tvcge.s32\td[0-9]+, d[0-9]+, d[0-9]+\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tvcge.u32\td[0-9]+, d[0-9]+, d[0-9]+\n} 2 } } */
+
+/* 128-bit vectors.  */
+
+/* vmvn is used by 'ne' comparisons.  */
+/* { dg-final { scan-assembler-times {\tvmvn\tq[0-9]+, q[0-9]+\n} 6 } } */
+
+/* { 8 bits } x { eq, ne, lt, le, gt, ge }.  */
+/* { dg-final { scan-assembler-times {\tvceq.i8\tq[0-9]+, q[0-9]+, q[0-9]+\n} 4 } } */
+/* { dg-final { scan-assembler-times {\tvcgt.s8\tq[0-9]+, q[0-9]+, q[0-9]+\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tvcgt.u8\tq[0-9]+, q[0-9]+, q[0-9]+\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tvcge.s8\tq[0-9]+, q[0-9]+, q[0-9]+\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tvcge.u8\tq[0-9]+, q[0-9]+, q[0-9]+\n} 2 } } */
+
+/* { 16 bits } x { eq, ne, lt, le, gt, ge }.  */
+/* { dg-final { scan-assembler-times {\tvceq.i16\tq[0-9]+, q[0-9]+, q[0-9]+\n} 4 } } */
+/* { dg-final { scan-assembler-times {\tvcgt.s16\tq[0-9]+, q[0-9]+, q[0-9]+\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tvcgt.u16\tq[0-9]+, q[0-9]+, q[0-9]+\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tvcge.s16\tq[0-9]+, q[0-9]+, q[0-9]+\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tvcge.u16\tq[0-9]+, q[0-9]+, q[0-9]+\n} 2 } } */
+
+/* { 32 bits } x { eq, ne, lt, le, gt, ge }.  */
+/* { dg-final { scan-assembler-times {\tvceq.i32\tq[0-9]+, q[0-9]+, q[0-9]+\n} 4 } } */
+/* { dg-final { scan-assembler-times {\tvcgt.s32\tq[0-9]+, q[0-9]+, q[0-9]+\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tvcgt.u32\tq[0-9]+, q[0-9]+, q[0-9]+\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tvcge.s32\tq[0-9]+, q[0-9]+, q[0-9]+\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tvcge.u32\tq[0-9]+, q[0-9]+, q[0-9]+\n} 2 } } */
diff --git a/gcc/testsuite/gcc.target/arm/simd/neon-vcmp-f16.c b/gcc/testsuite/gcc.target/arm/simd/neon-vcmp-f16.c
new file mode 100644
index 00000000000..688fd9a235f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/simd/neon-vcmp-f16.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_v8_2a_fp16_neon_ok } */
+/* { dg-add-options arm_v8_2a_fp16_neon } */
+/* { dg-additional-options "-O3 -funsafe-math-optimizations" } */
+
+#include "mve-vcmp-f16.c"
+
+/* 'ne' uses vceq.  */
+/* le and lt use ge and gt with inverted operands.  */
+/* { dg-final { scan-assembler-times {\tvceq.f16\tq[0-9]+, q[0-9]+, q[0-9]+\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tvcge.f16\tq[0-9]+, q[0-9]+, q[0-9]+\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tvcgt.f16\tq[0-9]+, q[0-9]+, q[0-9]+\n} 2 } } */
diff --git a/gcc/testsuite/gcc.target/arm/simd/neon-vcmp-f32-2.c b/gcc/testsuite/gcc.target/arm/simd/neon-vcmp-f32-2.c
new file mode 100644
index 00000000000..a22923eb242
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/simd/neon-vcmp-f32-2.c
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-add-options arm_neon } */
+/* { dg-additional-options "-O3 -funsafe-math-optimizations" } */
+
+#include "mve-vcmp-f32-2.c"
+
+/* 'ne' uses vceq.  */
+/* le and lt use ge and gt with inverted operands.  */
+/* { dg-final { scan-assembler-times {\tvceq.f32\tq[0-9]+, q[0-9]+, q[0-9]+\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tvcge.f32\tq[0-9]+, q[0-9]+, q[0-9]+\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tvcgt.f32\tq[0-9]+, q[0-9]+, q[0-9]+\n} 2 } } */
+
+/* { dg-final { scan-assembler-times {\tvmov.f32\tq[0-9]+, #2.0e\+0} 6 } } */
+/* { dg-final { scan-assembler-times {\tvmov.f32\tq[0-9]+, #3.0e\+0} 6 } } */
diff --git a/gcc/testsuite/gcc.target/arm/simd/neon-vcmp-f32-3.c b/gcc/testsuite/gcc.target/arm/simd/neon-vcmp-f32-3.c
new file mode 100644
index 00000000000..4f12f043d3a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/simd/neon-vcmp-f32-3.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-add-options arm_neon } */
+/* { dg-additional-options "-O3" } */
+
+#include "mve-vcmp-f32.c"
+
+/* Should not be vectorized, since we do not use -funsafe-math-optimizations.  */
+
+/* { dg-final { scan-assembler-not {\tvceq.f32\tq[0-9]+, q[0-9]+, q[0-9]+\n} } } */
+/* { dg-final { scan-assembler-not {\tvcge.f32\tq[0-9]+, q[0-9]+, q[0-9]+\n} } } */
+/* { dg-final { scan-assembler-not {\tvcgt.f32\tq[0-9]+, q[0-9]+, q[0-9]+\n} } } */
diff --git a/gcc/testsuite/gcc.target/arm/simd/neon-vcmp-f32.c b/gcc/testsuite/gcc.target/arm/simd/neon-vcmp-f32.c
new file mode 100644
index 00000000000..06e5c4fd1d1
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/simd/neon-vcmp-f32.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-add-options arm_neon } */
+/* { dg-additional-options "-O3 -funsafe-math-optimizations" } */
+
+#include "mve-vcmp-f32.c"
+
+/* 'ne' uses vceq.  */
+/* le and lt use ge and gt with inverted operands.  */
+/* { dg-final { scan-assembler-times {\tvceq.f32\tq[0-9]+, q[0-9]+, q[0-9]+\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tvcge.f32\tq[0-9]+, q[0-9]+, q[0-9]+\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tvcgt.f32\tq[0-9]+, q[0-9]+, q[0-9]+\n} 2 } } */
diff --git a/gcc/testsuite/gcc.target/arm/simd/neon-vcmp.c b/gcc/testsuite/gcc.target/arm/simd/neon-vcmp.c
new file mode 100644
index 00000000000..f2b92b1be7f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/simd/neon-vcmp.c
@@ -0,0 +1,22 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-add-options arm_neon } */
+/* { dg-additional-options "-O3" } */
+
+#include "mve-vcmp.c"
+
+/* vceq is also used for 'ne' comparisons.  */
+/* { dg-final { scan-assembler-times {\tvceq.i[0-9]+\td[0-9]+, d[0-9]+, d[0-9]+\n} 12 } } */
+/* { dg-final { scan-assembler-times {\tvceq.i[0-9]+\tq[0-9]+, q[0-9]+, q[0-9]+\n} 12 } } */
+
+/* lt and le are replaced with the opposite condition, hence the double number
+   of matches for gt and ge.  */
+/* { dg-final { scan-assembler-times {\tvcge.s[0-9]+\td[0-9]+, d[0-9]+, d[0-9]+\n} 6 } } */
+/* { dg-final { scan-assembler-times {\tvcge.s[0-9]+\tq[0-9]+, q[0-9]+, q[0-9]+\n} 6 } } */
+/* { dg-final { scan-assembler-times {\tvcge.u[0-9]+\td[0-9]+, d[0-9]+, d[0-9]+\n} 6 } } */
+/* { dg-final { scan-assembler-times {\tvcge.u[0-9]+\tq[0-9]+, q[0-9]+, q[0-9]+\n} 6 } } */
+
+/* { dg-final { scan-assembler-times {\tvcgt.s[0-9]+\td[0-9]+, d[0-9]+, d[0-9]+\n} 6 } } */
+/* { dg-final { scan-assembler-times {\tvcgt.s[0-9]+\tq[0-9]+, q[0-9]+, q[0-9]+\n} 6 } } */
+/* { dg-final { scan-assembler-times {\tvcgt.u[0-9]+\td[0-9]+, d[0-9]+, d[0-9]+\n} 6 } } */
+/* { dg-final { scan-assembler-times {\tvcgt.u[0-9]+\tq[0-9]+, q[0-9]+, q[0-9]+\n} 6 } } */
-- 
2.25.1



* [PATCH 02/13] arm: Add tests for PR target/100757
  2021-09-07  9:15 [PATCH 00/13] ARM/MVE use vectors of boolean for predicates Christophe Lyon
  2021-09-07  9:15 ` [PATCH 01/13] arm: Add new tests for comparison vectorization with Neon and MVE Christophe Lyon
@ 2021-09-07  9:15 ` Christophe Lyon
  2021-09-28 11:12   ` Kyrylo Tkachov
  2021-09-07  9:15 ` [PATCH 03/13] arm: Add test for PR target/101325 Christophe Lyon
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 20+ messages in thread
From: Christophe Lyon @ 2021-09-07  9:15 UTC
  To: gcc-patches

These tests currently trigger an ICE which is fixed later in the patch
series.

The pr100757*.c testcases are derived from
gcc.c-torture/compile/20160205-1.c, forcing the use of MVE, and using
various types and return values different from 0 and 1 to avoid
commonalization with boolean masks.  In addition, since we should not
need these masks, the tests make sure they are not present.

2021-09-01  Christophe Lyon  <christophe.lyon@foss.st.com>

	gcc/testsuite/
	PR target/100757
	* gcc.target/arm/simd/pr100757-2.c: New.
	* gcc.target/arm/simd/pr100757-3.c: New.
	* gcc.target/arm/simd/pr100757-4.c: New.
	* gcc.target/arm/simd/pr100757.c: New.

diff --git a/gcc/testsuite/gcc.target/arm/simd/pr100757-2.c b/gcc/testsuite/gcc.target/arm/simd/pr100757-2.c
new file mode 100644
index 00000000000..c2262b4d81e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/simd/pr100757-2.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_v8_1m_mve_fp_ok } */
+/* { dg-add-options arm_v8_1m_mve_fp } */
+/* { dg-additional-options "-O3 -funsafe-math-optimizations" } */
+/* Derived from gcc.c-torture/compile/20160205-1.c.  */
+
+float a[32];
+int fn1(int d) {
+  int c = 4;
+  for (int b = 0; b < 32; b++)
+    if (a[b] != 2.0f)
+      c = 5;
+  return c;
+}
+
+/* { dg-final { scan-assembler-times {\t.word\t1073741824\n} 4 } } */ /* Constant 2.0f.  */
+/* { dg-final { scan-assembler-times {\t.word\t4\n} 4 } } */ /* Initial value for c.  */
+/* { dg-final { scan-assembler-times {\t.word\t5\n} 4 } } */ /* Possible value for c.  */
+/* { dg-final { scan-assembler-not {\t.word\t1\n} } } */ /* 'true' mask.  */
+/* { dg-final { scan-assembler-not {\t.word\t0\n} } } */ /* 'false' mask.  */
diff --git a/gcc/testsuite/gcc.target/arm/simd/pr100757-3.c b/gcc/testsuite/gcc.target/arm/simd/pr100757-3.c
new file mode 100644
index 00000000000..e604555c04c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/simd/pr100757-3.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_v8_1m_mve_fp_ok } */
+/* { dg-add-options arm_v8_1m_mve_fp } */
+/* { dg-additional-options "-O3 -funsafe-math-optimizations" } */
+/* Derived from gcc.c-torture/compile/20160205-1.c.  */
+
+float a[32];
+float fn1(int d) {
+  float c = 4.0f;
+  for (int b = 0; b < 32; b++)
+    if (a[b] != 2.0f)
+      c = 5.0f;
+  return c;
+}
+
+/* { dg-final { scan-assembler-times {\t.word\t1073741824\n} 4 } } */ /* Constant 2.0f.  */
+/* { dg-final { scan-assembler-times {\t.word\t1084227584\n} 4 } } */ /* Possible value for c (5.0).  */
+/* { dg-final { scan-assembler-times {\t.word\t1082130432\n} 4 } } */ /* Initial value for c (4.0).  */
+/* { dg-final { scan-assembler-not {\t.word\t1\n} } } */ /* 'true' mask.  */
+/* { dg-final { scan-assembler-not {\t.word\t0\n} } } */ /* 'false' mask.  */
diff --git a/gcc/testsuite/gcc.target/arm/simd/pr100757-4.c b/gcc/testsuite/gcc.target/arm/simd/pr100757-4.c
new file mode 100644
index 00000000000..c12040c517f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/simd/pr100757-4.c
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_v8_1m_mve_ok } */
+/* { dg-add-options arm_v8_1m_mve } */
+/* { dg-additional-options "-O3" } */
+/* Derived from gcc.c-torture/compile/20160205-1.c.  */
+
+unsigned int a[32];
+int fn1(int d) {
+  int c = 2;
+  for (int b = 0; b < 32; b++)
+    if (a[b])
+      c = 3;
+  return c;
+}
+
+/* { dg-final { scan-assembler-times {\t.word\t0\n} 4 } } */ /* 'false' mask.  */
+/* { dg-final { scan-assembler-not {\t.word\t1\n} } } */ /* 'true' mask.  */
+/* { dg-final { scan-assembler-times {\t.word\t2\n} 4 } } */ /* Initial value for c.  */
+/* { dg-final { scan-assembler-times {\t.word\t3\n} 4 } } */ /* Possible value for c.  */
diff --git a/gcc/testsuite/gcc.target/arm/simd/pr100757.c b/gcc/testsuite/gcc.target/arm/simd/pr100757.c
new file mode 100644
index 00000000000..41d6e4e2d7a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/simd/pr100757.c
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_v8_1m_mve_ok } */
+/* { dg-add-options arm_v8_1m_mve } */
+/* { dg-additional-options "-O3" } */
+/* Derived from gcc.c-torture/compile/20160205-1.c.  */
+
+int a[32];
+int fn1(int d) {
+  int c = 2;
+  for (int b = 0; b < 32; b++)
+    if (a[b])
+      c = 3;
+  return c;
+}
+
+/* { dg-final { scan-assembler-times {\t.word\t0\n} 4 } } */ /* 'false' mask.  */
+/* { dg-final { scan-assembler-not {\t.word\t1\n} } } */ /* 'true' mask.  */
+/* { dg-final { scan-assembler-times {\t.word\t2\n} 4 } } */ /* Initial value for c.  */
+/* { dg-final { scan-assembler-times {\t.word\t3\n} 4 } } */ /* Possible value for c.  */
-- 
2.25.1



* [PATCH 03/13] arm: Add test for PR target/101325
  2021-09-07  9:15 [PATCH 00/13] ARM/MVE use vectors of boolean for predicates Christophe Lyon
  2021-09-07  9:15 ` [PATCH 01/13] arm: Add new tests for comparison vectorization with Neon and MVE Christophe Lyon
  2021-09-07  9:15 ` [PATCH 02/13] arm: Add tests for PR target/100757 Christophe Lyon
@ 2021-09-07  9:15 ` Christophe Lyon
  2021-09-28 11:14   ` Kyrylo Tkachov
  2021-09-07  9:15 ` [PATCH 04/13] arm: Add GENERAL_AND_VPR_REGS regclass Christophe Lyon
  2021-09-13  8:33 ` [PATCH 00/13] ARM/MVE use vectors of boolean for predicates Christophe LYON
  4 siblings, 1 reply; 20+ messages in thread
From: Christophe Lyon @ 2021-09-07  9:15 UTC
  To: gcc-patches

This test is derived from the one provided in the PR: it is a
compile-only test because I do not have access to anything that could
execute it.  We can switch it to 'dg-do run' later; however, it would
be better to write a new executable test to ensure coverage in case
the tester cannot execute such code (and it will need a new
arm_v8_1m_mve_hw or similar effective-target).

2021-09-01  Christophe Lyon  <christophe.lyon@foss.st.com>

	gcc/testsuite/
	PR target/101325
	* gcc.target/arm/simd/pr101325.c: New.

diff --git a/gcc/testsuite/gcc.target/arm/simd/pr101325.c b/gcc/testsuite/gcc.target/arm/simd/pr101325.c
new file mode 100644
index 00000000000..a466683a0b1
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/simd/pr101325.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_v8_1m_mve_ok } */
+/* { dg-add-options arm_v8_1m_mve } */
+/* { dg-additional-options "-O3" } */
+
+#include <arm_mve.h>
+
+unsigned foo(int8x16_t v, int8x16_t w)
+{
+  return vcmpeqq (v, w);
+}
+/* { dg-final { scan-assembler {\tvcmp.i8  eq} } } */
+/* { dg-final { scan-assembler {\tvmrs\t r[0-9]+, P0} } } */
+/* { dg-final { scan-assembler {\tuxth} } } */
-- 
2.25.1



* [PATCH 04/13] arm: Add GENERAL_AND_VPR_REGS regclass
  2021-09-07  9:15 [PATCH 00/13] ARM/MVE use vectors of boolean for predicates Christophe Lyon
                   ` (2 preceding siblings ...)
  2021-09-07  9:15 ` [PATCH 03/13] arm: Add test for PR target/101325 Christophe Lyon
@ 2021-09-07  9:15 ` Christophe Lyon
  2021-09-07  9:42   ` Richard Earnshaw
  2021-09-13  8:33 ` [PATCH 00/13] ARM/MVE use vectors of boolean for predicates Christophe LYON
  4 siblings, 1 reply; 20+ messages in thread
From: Christophe Lyon @ 2021-09-07  9:15 UTC
  To: gcc-patches

At some point during the development of this patch series, it appeared
that in some cases the register allocator wants “VPR or general”
rather than “VPR or general or FP” (which is the same thing as
ALL_REGS).  The series does not seem to require this anymore, but it
seems to be a good thing to do anyway, to give the register allocator
more freedom.

2021-09-01  Christophe Lyon  <christophe.lyon@foss.st.com>

	gcc/
	* config/arm/arm.h (reg_class): Add GENERAL_AND_VPR_REGS.
	(REG_CLASS_NAMES): Likewise.
	(REG_CLASS_CONTENTS): Likewise. Add VPR_REG to ALL_REGS.

diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
index 015299c1534..fab39d05916 100644
--- a/gcc/config/arm/arm.h
+++ b/gcc/config/arm/arm.h
@@ -1286,6 +1286,7 @@ enum reg_class
   SFP_REG,
   AFP_REG,
   VPR_REG,
+  GENERAL_AND_VPR_REGS,
   ALL_REGS,
   LIM_REG_CLASSES
 };
@@ -1315,6 +1316,7 @@ enum reg_class
   "SFP_REG",		\
   "AFP_REG",		\
   "VPR_REG",		\
+  "GENERAL_AND_VPR_REGS", \
   "ALL_REGS"		\
 }
 
@@ -1343,7 +1345,8 @@ enum reg_class
   { 0x00000000, 0x00000000, 0x00000000, 0x00000040 }, /* SFP_REG */	\
   { 0x00000000, 0x00000000, 0x00000000, 0x00000080 }, /* AFP_REG */	\
   { 0x00000000, 0x00000000, 0x00000000, 0x00000400 }, /* VPR_REG.  */	\
-  { 0xFFFF7FFF, 0xFFFFFFFF, 0xFFFFFFFF, 0x0000000F }  /* ALL_REGS.  */	\
+  { 0x00005FFF, 0x00000000, 0x00000000, 0x00000400 }, /* GENERAL_AND_VPR_REGS.  */ \
+  { 0xFFFF7FFF, 0xFFFFFFFF, 0xFFFFFFFF, 0x0000040F }  /* ALL_REGS.  */	\
 }
 
 #define FP_SYSREGS \
-- 
2.25.1


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 04/13] arm: Add GENERAL_AND_VPR_REGS regclass
  2021-09-07  9:15 ` [PATCH 04/13] arm: Add GENERAL_AND_VPR_REGS regclass Christophe Lyon
@ 2021-09-07  9:42   ` Richard Earnshaw
  2021-09-07 12:05     ` Christophe LYON
  0 siblings, 1 reply; 20+ messages in thread
From: Richard Earnshaw @ 2021-09-07  9:42 UTC (permalink / raw)
  To: Christophe Lyon, gcc-patches



On 07/09/2021 10:15, Christophe Lyon via Gcc-patches wrote:
> At some point during the development of this patch series, it appeared
> that in some cases the register allocator wants “VPR or general”
> rather than “VPR or general or FP” (which is the same thing as
> ALL_REGS).  The series does not seem to require this anymore, but it
> seems to be a good thing to do anyway, to give the register allocator
> more freedom.
> 
> 2021-09-01  Christophe Lyon  <christophe.lyon@foss.st.com>
> 
> 	gcc/
> 	* config/arm/arm.h (reg_class): Add GENERAL_AND_VPR_REGS.
> 	(REG_CLASS_NAMES): Likewise.
> 	(REG_CLASS_CONTENTS): Likewise. Add VPR_REG to ALL_REGS.
> 
> diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
> index 015299c1534..fab39d05916 100644
> --- a/gcc/config/arm/arm.h
> +++ b/gcc/config/arm/arm.h
> @@ -1286,6 +1286,7 @@ enum reg_class
>     SFP_REG,
>     AFP_REG,
>     VPR_REG,
> +  GENERAL_AND_VPR_REGS,
>     ALL_REGS,
>     LIM_REG_CLASSES
>   };
> @@ -1315,6 +1316,7 @@ enum reg_class
>     "SFP_REG",		\
>     "AFP_REG",		\
>     "VPR_REG",		\
> +  "GENERAL_AND_VPR_REGS", \
>     "ALL_REGS"		\
>   }
>   
> @@ -1343,7 +1345,8 @@ enum reg_class
>     { 0x00000000, 0x00000000, 0x00000000, 0x00000040 }, /* SFP_REG */	\
>     { 0x00000000, 0x00000000, 0x00000000, 0x00000080 }, /* AFP_REG */	\
>     { 0x00000000, 0x00000000, 0x00000000, 0x00000400 }, /* VPR_REG.  */	\
> -  { 0xFFFF7FFF, 0xFFFFFFFF, 0xFFFFFFFF, 0x0000000F }  /* ALL_REGS.  */	\
> +  { 0x00005FFF, 0x00000000, 0x00000000, 0x00000400 }, /* GENERAL_AND_VPR_REGS.  */ \
> +  { 0xFFFF7FFF, 0xFFFFFFFF, 0xFFFFFFFF, 0x0000040F }  /* ALL_REGS.  */	\
>   }

You've changed the definition of ALL_REGS here (to include VPR_REG), but 
not really explained why.  Is that the source of the underlying issue 
with the 'appeared' you mention?

R.


>   
>   #define FP_SYSREGS \
> 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 04/13] arm: Add GENERAL_AND_VPR_REGS regclass
  2021-09-07  9:42   ` Richard Earnshaw
@ 2021-09-07 12:05     ` Christophe LYON
  2021-09-07 13:35       ` Richard Earnshaw
  0 siblings, 1 reply; 20+ messages in thread
From: Christophe LYON @ 2021-09-07 12:05 UTC (permalink / raw)
  To: Richard Earnshaw, gcc-patches


On 07/09/2021 11:42, Richard Earnshaw wrote:
>
>
> On 07/09/2021 10:15, Christophe Lyon via Gcc-patches wrote:
>> At some point during the development of this patch series, it appeared
>> that in some cases the register allocator wants “VPR or general”
>> rather than “VPR or general or FP” (which is the same thing as
>> ALL_REGS).  The series does not seem to require this anymore, but it
>> seems to be a good thing to do anyway, to give the register allocator
>> more freedom.
>>
>> 2021-09-01  Christophe Lyon  <christophe.lyon@foss.st.com>
>>
>>     gcc/
>>     * config/arm/arm.h (reg_class): Add GENERAL_AND_VPR_REGS.
>>     (REG_CLASS_NAMES): Likewise.
>>     (REG_CLASS_CONTENTS): Likewise. Add VPR_REG to ALL_REGS.
>>
>> diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
>> index 015299c1534..fab39d05916 100644
>> --- a/gcc/config/arm/arm.h
>> +++ b/gcc/config/arm/arm.h
>> @@ -1286,6 +1286,7 @@ enum reg_class
>>     SFP_REG,
>>     AFP_REG,
>>     VPR_REG,
>> +  GENERAL_AND_VPR_REGS,
>>     ALL_REGS,
>>     LIM_REG_CLASSES
>>   };
>> @@ -1315,6 +1316,7 @@ enum reg_class
>>     "SFP_REG",        \
>>     "AFP_REG",        \
>>     "VPR_REG",        \
>> +  "GENERAL_AND_VPR_REGS", \
>>     "ALL_REGS"        \
>>   }
>>
>> @@ -1343,7 +1345,8 @@ enum reg_class
>>     { 0x00000000, 0x00000000, 0x00000000, 0x00000040 }, /* SFP_REG */    \
>>     { 0x00000000, 0x00000000, 0x00000000, 0x00000080 }, /* AFP_REG */    \
>>     { 0x00000000, 0x00000000, 0x00000000, 0x00000400 }, /* VPR_REG.  */    \
>> -  { 0xFFFF7FFF, 0xFFFFFFFF, 0xFFFFFFFF, 0x0000000F }  /* ALL_REGS.  */    \
>> +  { 0x00005FFF, 0x00000000, 0x00000000, 0x00000400 }, /* GENERAL_AND_VPR_REGS.  */ \
>> +  { 0xFFFF7FFF, 0xFFFFFFFF, 0xFFFFFFFF, 0x0000040F }  /* ALL_REGS.  */    \
>>   }
>
> You've changed the definition of ALL_REGS here (to include VPR_REG), 
> but not really explained why.  Is that the source of the underlying 
> issue with the 'appeared' you mention?


I first added VPR_REG to ALL_REGS, but Richard Sandiford suggested I 
create a new GENERAL_AND_VPR_REGS that would be more restrictive. I did 
not remove VPR_REG from ALL_REGS because I thought it was an omission: 
shouldn't ALL_REGS contain all registers?
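
For illustration only (this is not part of the patch): a minimal C sketch
of the subset relations implied by the REG_CLASS_CONTENTS rows quoted
above.  The mask values are copied from the diff; the array names and the
helper mask_subset_p are hypothetical and are not GCC's own API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Each register class is described by four 32-bit words, as in the
   REG_CLASS_CONTENTS rows quoted in the patch.  */
#define N_WORDS 4

/* Values copied from the diff; the identifiers are illustrative.  */
static const uint32_t vpr_reg[N_WORDS] =
  { 0x00000000, 0x00000000, 0x00000000, 0x00000400 };
static const uint32_t general_and_vpr_regs[N_WORDS] =
  { 0x00005FFF, 0x00000000, 0x00000000, 0x00000400 };
static const uint32_t all_regs_before[N_WORDS] =
  { 0xFFFF7FFF, 0xFFFFFFFF, 0xFFFFFFFF, 0x0000000F };
static const uint32_t all_regs_after[N_WORDS] =
  { 0xFFFF7FFF, 0xFFFFFFFF, 0xFFFFFFFF, 0x0000040F };

/* Hypothetical helper: true if every bit set in SUB is also set in
   SUPER.  */
static bool
mask_subset_p (const uint32_t *sub, const uint32_t *super)
{
  for (int i = 0; i < N_WORDS; i++)
    if (sub[i] & ~super[i])
      return false;
  return true;
}

/* Check the relations discussed in this thread.  */
static void
check_masks (void)
{
  /* The new class contains VPR.  */
  assert (mask_subset_p (vpr_reg, general_and_vpr_regs));
  /* Before the patch, the ALL_REGS mask did not have the VPR bit set.  */
  assert (!mask_subset_p (vpr_reg, all_regs_before));
  /* With the extra 0x400 bit, ALL_REGS covers VPR and the new class.  */
  assert (mask_subset_p (vpr_reg, all_regs_after));
  assert (mask_subset_p (general_and_vpr_regs, all_regs_after));
}
```

Calling check_masks () from a small test program exercises all four
relations; the only difference between the two ALL_REGS rows is the
0x400 bit in the last word.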


>
> R.
>
>
>>     #define FP_SYSREGS \
>>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 04/13] arm: Add GENERAL_AND_VPR_REGS regclass
  2021-09-07 12:05     ` Christophe LYON
@ 2021-09-07 13:35       ` Richard Earnshaw
  2021-09-08  7:48         ` Christophe LYON
  0 siblings, 1 reply; 20+ messages in thread
From: Richard Earnshaw @ 2021-09-07 13:35 UTC (permalink / raw)
  To: Christophe LYON, gcc-patches



On 07/09/2021 13:05, Christophe LYON wrote:
> 
> On 07/09/2021 11:42, Richard Earnshaw wrote:
>>
>>
>> On 07/09/2021 10:15, Christophe Lyon via Gcc-patches wrote:
>>> At some point during the development of this patch series, it appeared
>>> that in some cases the register allocator wants “VPR or general”
>>> rather than “VPR or general or FP” (which is the same thing as
>>> ALL_REGS).  The series does not seem to require this anymore, but it
>>> seems to be a good thing to do anyway, to give the register allocator
>>> more freedom.
>>>
>>> 2021-09-01  Christophe Lyon  <christophe.lyon@foss.st.com>
>>>
>>>     gcc/
>>>     * config/arm/arm.h (reg_class): Add GENERAL_AND_VPR_REGS.
>>>     (REG_CLASS_NAMES): Likewise.
>>>     (REG_CLASS_CONTENTS): Likewise. Add VPR_REG to ALL_REGS.
>>>
>>> diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
>>> index 015299c1534..fab39d05916 100644
>>> --- a/gcc/config/arm/arm.h
>>> +++ b/gcc/config/arm/arm.h
>>> @@ -1286,6 +1286,7 @@ enum reg_class
>>>     SFP_REG,
>>>     AFP_REG,
>>>     VPR_REG,
>>> +  GENERAL_AND_VPR_REGS,
>>>     ALL_REGS,
>>>     LIM_REG_CLASSES
>>>   };
>>> @@ -1315,6 +1316,7 @@ enum reg_class
>>>     "SFP_REG",        \
>>>     "AFP_REG",        \
>>>     "VPR_REG",        \
>>> +  "GENERAL_AND_VPR_REGS", \
>>>     "ALL_REGS"        \
>>>   }
>>>
>>> @@ -1343,7 +1345,8 @@ enum reg_class
>>>     { 0x00000000, 0x00000000, 0x00000000, 0x00000040 }, /* SFP_REG */    \
>>>     { 0x00000000, 0x00000000, 0x00000000, 0x00000080 }, /* AFP_REG */    \
>>>     { 0x00000000, 0x00000000, 0x00000000, 0x00000400 }, /* VPR_REG.  */    \
>>> -  { 0xFFFF7FFF, 0xFFFFFFFF, 0xFFFFFFFF, 0x0000000F }  /* ALL_REGS.  */    \
>>> +  { 0x00005FFF, 0x00000000, 0x00000000, 0x00000400 }, /* GENERAL_AND_VPR_REGS.  */ \
>>> +  { 0xFFFF7FFF, 0xFFFFFFFF, 0xFFFFFFFF, 0x0000040F }  /* ALL_REGS.  */    \
>>>   }
>>
>> You've changed the definition of ALL_REGS here (to include VPR_REG), 
>> but not really explained why.  Is that the source of the underlying 
>> issue with the 'appeared' you mention?
> 
> 
> I first added VPR_REG to ALL_REGS, but Richard Sandiford suggested I 
> create a new GENERAL_AND_VPR_REGS that would be more restrictive. I did 
> not remove VPR_REG from ALL_REGS because I thought it was an omission: 
> shouldn't ALL_REGS contain all registers?

Surely that should be a separate patch then.

R.

> 
> 
>>
>> R.
>>
>>
>>>     #define FP_SYSREGS \
>>>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 04/13] arm: Add GENERAL_AND_VPR_REGS regclass
  2021-09-07 13:35       ` Richard Earnshaw
@ 2021-09-08  7:48         ` Christophe LYON
  2021-09-28 11:18           ` Kyrylo Tkachov
  0 siblings, 1 reply; 20+ messages in thread
From: Christophe LYON @ 2021-09-08  7:48 UTC (permalink / raw)
  To: Richard Earnshaw, gcc-patches


On 07/09/2021 15:35, Richard Earnshaw wrote:
>
>
> On 07/09/2021 13:05, Christophe LYON wrote:
>>
>> On 07/09/2021 11:42, Richard Earnshaw wrote:
>>>
>>>
>>> On 07/09/2021 10:15, Christophe Lyon via Gcc-patches wrote:
>>>> At some point during the development of this patch series, it appeared
>>>> that in some cases the register allocator wants “VPR or general”
>>>> rather than “VPR or general or FP” (which is the same thing as
>>>> ALL_REGS).  The series does not seem to require this anymore, but it
>>>> seems to be a good thing to do anyway, to give the register allocator
>>>> more freedom.
>>>>
>>>> 2021-09-01  Christophe Lyon <christophe.lyon@foss.st.com>
>>>>
>>>>     gcc/
>>>>     * config/arm/arm.h (reg_class): Add GENERAL_AND_VPR_REGS.
>>>>     (REG_CLASS_NAMES): Likewise.
>>>>     (REG_CLASS_CONTENTS): Likewise. Add VPR_REG to ALL_REGS.
>>>>
>>>> diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
>>>> index 015299c1534..fab39d05916 100644
>>>> --- a/gcc/config/arm/arm.h
>>>> +++ b/gcc/config/arm/arm.h
>>>> @@ -1286,6 +1286,7 @@ enum reg_class
>>>>     SFP_REG,
>>>>     AFP_REG,
>>>>     VPR_REG,
>>>> +  GENERAL_AND_VPR_REGS,
>>>>     ALL_REGS,
>>>>     LIM_REG_CLASSES
>>>>   };
>>>> @@ -1315,6 +1316,7 @@ enum reg_class
>>>>     "SFP_REG",        \
>>>>     "AFP_REG",        \
>>>>     "VPR_REG",        \
>>>> +  "GENERAL_AND_VPR_REGS", \
>>>>     "ALL_REGS"        \
>>>>   }
>>>>
>>>> @@ -1343,7 +1345,8 @@ enum reg_class
>>>>     { 0x00000000, 0x00000000, 0x00000000, 0x00000040 }, /* SFP_REG */    \
>>>>     { 0x00000000, 0x00000000, 0x00000000, 0x00000080 }, /* AFP_REG */    \
>>>>     { 0x00000000, 0x00000000, 0x00000000, 0x00000400 }, /* VPR_REG.  */    \
>>>> -  { 0xFFFF7FFF, 0xFFFFFFFF, 0xFFFFFFFF, 0x0000000F }  /* ALL_REGS.  */    \
>>>> +  { 0x00005FFF, 0x00000000, 0x00000000, 0x00000400 }, /* GENERAL_AND_VPR_REGS.  */ \
>>>> +  { 0xFFFF7FFF, 0xFFFFFFFF, 0xFFFFFFFF, 0x0000040F }  /* ALL_REGS.  */    \
>>>>   }
>>>
>>> You've changed the definition of ALL_REGS here (to include VPR_REG), 
>>> but not really explained why.  Is that the source of the underlying 
>>> issue with the 'appeared' you mention?
>>
>>
>> I first added VPR_REG to ALL_REGS, but Richard Sandiford suggested I 
>> create a new GENERAL_AND_VPR_REGS that would be more restrictive. I 
>> did not remove VPR_REG from ALL_REGS because I thought it was an 
>> omission: shouldn't ALL_REGS contain all registers?
>
> Surely that should be a separate patch then.

OK, I can remove that line from this patch and make a separate one-liner 
for ALL_REGS.

Thanks,

Christophe


>
> R.
>
>>
>>
>>>
>>> R.
>>>
>>>
>>>>     #define FP_SYSREGS \
>>>>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 00/13] ARM/MVE use vectors of boolean for predicates
  2021-09-07  9:15 [PATCH 00/13] ARM/MVE use vectors of boolean for predicates Christophe Lyon
                   ` (3 preceding siblings ...)
  2021-09-07  9:15 ` [PATCH 04/13] arm: Add GENERAL_AND_VPR_REGS regclass Christophe Lyon
@ 2021-09-13  8:33 ` Christophe LYON
  2021-09-20  9:21   ` Christophe LYON
  4 siblings, 1 reply; 20+ messages in thread
From: Christophe LYON @ 2021-09-13  8:33 UTC (permalink / raw)
  To: gcc Patches

ping?


On 07/09/2021 11:15, Christophe Lyon wrote:
> This patch series addresses PR 100757 and 101325 by representing
> vectors of predicates (MVE VPR.P0 register) as vectors of booleans
> rather than using HImode.
>
> As this implies a lot of mostly mechanical changes, I have tried to
> split the patches in a way that should help reviewers, but the split
> is a bit artificial.
>
> Patches 1-3 add new tests.
>
> Patches 4-6 are small independent improvements.
>
> Patch 7 implements the predicate qualifier, but does not change any
> builtin yet.
>
> Patch 8 is the first of the two main patches, and uses the new
> qualifier to describe the vcmp and vpsel builtins that are useful for
> auto-vectorization of comparisons.
>
> Patch 9 is the second main patch, which fixes the vcond_mask expander.
>
> Patches 10-13 convert almost all the remaining builtins with HI
> operands to use the predicate qualifier.  After these, there are still
> a few builtins with HI operands left, about which I am not sure: vctp,
> vpnot, load-gather and store-scatter with v2di operands.  In fact,
> patches 11/12 update some STR/LDR qualifiers in a way that breaks
> these v2di builtins although existing tests still pass.
>
> Christophe Lyon (13):
>    arm: Add new tests for comparison vectorization with Neon and MVE
>    arm: Add tests for PR target/100757
>    arm: Add test for PR target/101325
>    arm: Add GENERAL_AND_VPR_REGS regclass
>    arm: Add support for VPR_REG in arm_class_likely_spilled_p
>    arm: Fix mve_vmvnq_n_<supf><mode> argument mode
>    arm: Implement MVE predicates as vectors of booleans
>    arm: Implement auto-vectorized MVE comparisons with vectors of boolean
>      predicates
>    arm: Fix vcond_mask expander for MVE (PR target/100757)
>    arm: Convert remaining MVE vcmp builtins to predicate qualifiers
>    arm: Convert more MVE builtins to predicate qualifiers
>    arm: Convert more load/store MVE builtins to predicate qualifiers
>    arm: Convert more MVE/CDE builtins to predicate qualifiers
>
>   gcc/config/arm/arm-builtins.c                 | 228 +++--
>   gcc/config/arm/arm-modes.def                  |   5 +
>   gcc/config/arm/arm-protos.h                   |   3 +-
>   gcc/config/arm/arm-simd-builtin-types.def     |   4 +
>   gcc/config/arm/arm.c                          | 128 ++-
>   gcc/config/arm/arm.h                          |   5 +-
>   gcc/config/arm/arm_mve_builtins.def           | 746 ++++++++--------
>   gcc/config/arm/iterators.md                   |   5 +
>   gcc/config/arm/mve.md                         | 823 ++++++++++--------
>   gcc/config/arm/neon.md                        |  39 +
>   gcc/config/arm/vec-common.md                  |  52 --
>   gcc/simplify-rtx.c                            |   7 +
>   .../arm/acle/cde-mve-full-assembly.c          | 264 +++---
>   .../gcc.target/arm/simd/mve-vcmp-f32-2.c      |  32 +
>   .../gcc.target/arm/simd/neon-compare-1.c      |  78 ++
>   .../gcc.target/arm/simd/neon-compare-2.c      |  13 +
>   .../gcc.target/arm/simd/neon-compare-3.c      |  14 +
>   .../arm/simd/neon-compare-scalar-1.c          |  57 ++
>   .../gcc.target/arm/simd/neon-vcmp-f16.c       |  12 +
>   .../gcc.target/arm/simd/neon-vcmp-f32-2.c     |  15 +
>   .../gcc.target/arm/simd/neon-vcmp-f32-3.c     |  12 +
>   .../gcc.target/arm/simd/neon-vcmp-f32.c       |  12 +
>   gcc/testsuite/gcc.target/arm/simd/neon-vcmp.c |  22 +
>   .../gcc.target/arm/simd/pr100757-2.c          |  20 +
>   .../gcc.target/arm/simd/pr100757-3.c          |  20 +
>   .../gcc.target/arm/simd/pr100757-4.c          |  19 +
>   gcc/testsuite/gcc.target/arm/simd/pr100757.c  |  19 +
>   gcc/testsuite/gcc.target/arm/simd/pr101325.c  |  14 +
>   28 files changed, 1581 insertions(+), 1087 deletions(-)
>   create mode 100644 gcc/testsuite/gcc.target/arm/simd/mve-vcmp-f32-2.c
>   create mode 100644 gcc/testsuite/gcc.target/arm/simd/neon-compare-1.c
>   create mode 100644 gcc/testsuite/gcc.target/arm/simd/neon-compare-2.c
>   create mode 100644 gcc/testsuite/gcc.target/arm/simd/neon-compare-3.c
>   create mode 100644 gcc/testsuite/gcc.target/arm/simd/neon-compare-scalar-1.c
>   create mode 100644 gcc/testsuite/gcc.target/arm/simd/neon-vcmp-f16.c
>   create mode 100644 gcc/testsuite/gcc.target/arm/simd/neon-vcmp-f32-2.c
>   create mode 100644 gcc/testsuite/gcc.target/arm/simd/neon-vcmp-f32-3.c
>   create mode 100644 gcc/testsuite/gcc.target/arm/simd/neon-vcmp-f32.c
>   create mode 100644 gcc/testsuite/gcc.target/arm/simd/neon-vcmp.c
>   create mode 100644 gcc/testsuite/gcc.target/arm/simd/pr100757-2.c
>   create mode 100644 gcc/testsuite/gcc.target/arm/simd/pr100757-3.c
>   create mode 100644 gcc/testsuite/gcc.target/arm/simd/pr100757-4.c
>   create mode 100644 gcc/testsuite/gcc.target/arm/simd/pr100757.c
>   create mode 100644 gcc/testsuite/gcc.target/arm/simd/pr101325.c
>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 00/13] ARM/MVE use vectors of boolean for predicates
  2021-09-13  8:33 ` [PATCH 00/13] ARM/MVE use vectors of boolean for predicates Christophe LYON
@ 2021-09-20  9:21   ` Christophe LYON
  0 siblings, 0 replies; 20+ messages in thread
From: Christophe LYON @ 2021-09-20  9:21 UTC (permalink / raw)
  To: gcc-patches

Ping?

On 13/09/2021 10:33, Christophe LYON via Gcc-patches wrote:
> ping?
>
>
> On 07/09/2021 11:15, Christophe Lyon wrote:
>> This patch series addresses PR 100757 and 101325 by representing
>> vectors of predicates (MVE VPR.P0 register) as vectors of booleans
>> rather than using HImode.
>>
>> As this implies a lot of mostly mechanical changes, I have tried to
>> split the patches in a way that should help reviewers, but the split
>> is a bit artificial.
>>
>> Patches 1-3 add new tests.
>>
>> Patches 4-6 are small independent improvements.
>>
>> Patch 7 implements the predicate qualifier, but does not change any
>> builtin yet.
>>
>> Patch 8 is the first of the two main patches, and uses the new
>> qualifier to describe the vcmp and vpsel builtins that are useful for
>> auto-vectorization of comparisons.
>>
>> Patch 9 is the second main patch, which fixes the vcond_mask expander.
>>
>> Patches 10-13 convert almost all the remaining builtins with HI
>> operands to use the predicate qualifier.  After these, there are still
>> a few builtins with HI operands left, about which I am not sure: vctp,
>> vpnot, load-gather and store-scatter with v2di operands.  In fact,
>> patches 11/12 update some STR/LDR qualifiers in a way that breaks
>> these v2di builtins although existing tests still pass.
>>
>> Christophe Lyon (13):
>>    arm: Add new tests for comparison vectorization with Neon and MVE
>>    arm: Add tests for PR target/100757
>>    arm: Add test for PR target/101325
>>    arm: Add GENERAL_AND_VPR_REGS regclass
>>    arm: Add support for VPR_REG in arm_class_likely_spilled_p
>>    arm: Fix mve_vmvnq_n_<supf><mode> argument mode
>>    arm: Implement MVE predicates as vectors of booleans
>>    arm: Implement auto-vectorized MVE comparisons with vectors of boolean
>>      predicates
>>    arm: Fix vcond_mask expander for MVE (PR target/100757)
>>    arm: Convert remaining MVE vcmp builtins to predicate qualifiers
>>    arm: Convert more MVE builtins to predicate qualifiers
>>    arm: Convert more load/store MVE builtins to predicate qualifiers
>>    arm: Convert more MVE/CDE builtins to predicate qualifiers
>>
>>   gcc/config/arm/arm-builtins.c                 | 228 +++--
>>   gcc/config/arm/arm-modes.def                  |   5 +
>>   gcc/config/arm/arm-protos.h                   |   3 +-
>>   gcc/config/arm/arm-simd-builtin-types.def     |   4 +
>>   gcc/config/arm/arm.c                          | 128 ++-
>>   gcc/config/arm/arm.h                          |   5 +-
>>   gcc/config/arm/arm_mve_builtins.def           | 746 ++++++++--------
>>   gcc/config/arm/iterators.md                   |   5 +
>>   gcc/config/arm/mve.md                         | 823 ++++++++++--------
>>   gcc/config/arm/neon.md                        |  39 +
>>   gcc/config/arm/vec-common.md                  |  52 --
>>   gcc/simplify-rtx.c                            |   7 +
>>   .../arm/acle/cde-mve-full-assembly.c          | 264 +++---
>>   .../gcc.target/arm/simd/mve-vcmp-f32-2.c      |  32 +
>>   .../gcc.target/arm/simd/neon-compare-1.c      |  78 ++
>>   .../gcc.target/arm/simd/neon-compare-2.c      |  13 +
>>   .../gcc.target/arm/simd/neon-compare-3.c      |  14 +
>>   .../arm/simd/neon-compare-scalar-1.c          |  57 ++
>>   .../gcc.target/arm/simd/neon-vcmp-f16.c       |  12 +
>>   .../gcc.target/arm/simd/neon-vcmp-f32-2.c     |  15 +
>>   .../gcc.target/arm/simd/neon-vcmp-f32-3.c     |  12 +
>>   .../gcc.target/arm/simd/neon-vcmp-f32.c       |  12 +
>>   gcc/testsuite/gcc.target/arm/simd/neon-vcmp.c |  22 +
>>   .../gcc.target/arm/simd/pr100757-2.c          |  20 +
>>   .../gcc.target/arm/simd/pr100757-3.c          |  20 +
>>   .../gcc.target/arm/simd/pr100757-4.c          |  19 +
>>   gcc/testsuite/gcc.target/arm/simd/pr100757.c  |  19 +
>>   gcc/testsuite/gcc.target/arm/simd/pr101325.c  |  14 +
>>   28 files changed, 1581 insertions(+), 1087 deletions(-)
>>   create mode 100644 gcc/testsuite/gcc.target/arm/simd/mve-vcmp-f32-2.c
>>   create mode 100644 gcc/testsuite/gcc.target/arm/simd/neon-compare-1.c
>>   create mode 100644 gcc/testsuite/gcc.target/arm/simd/neon-compare-2.c
>>   create mode 100644 gcc/testsuite/gcc.target/arm/simd/neon-compare-3.c
>>   create mode 100644 gcc/testsuite/gcc.target/arm/simd/neon-compare-scalar-1.c
>>   create mode 100644 gcc/testsuite/gcc.target/arm/simd/neon-vcmp-f16.c
>>   create mode 100644 gcc/testsuite/gcc.target/arm/simd/neon-vcmp-f32-2.c
>>   create mode 100644 gcc/testsuite/gcc.target/arm/simd/neon-vcmp-f32-3.c
>>   create mode 100644 gcc/testsuite/gcc.target/arm/simd/neon-vcmp-f32.c
>>   create mode 100644 gcc/testsuite/gcc.target/arm/simd/neon-vcmp.c
>>   create mode 100644 gcc/testsuite/gcc.target/arm/simd/pr100757-2.c
>>   create mode 100644 gcc/testsuite/gcc.target/arm/simd/pr100757-3.c
>>   create mode 100644 gcc/testsuite/gcc.target/arm/simd/pr100757-4.c
>>   create mode 100644 gcc/testsuite/gcc.target/arm/simd/pr100757.c
>>   create mode 100644 gcc/testsuite/gcc.target/arm/simd/pr101325.c
>>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* RE: [PATCH 01/13] arm: Add new tests for comparison vectorization with Neon and MVE
  2021-09-07  9:15 ` [PATCH 01/13] arm: Add new tests for comparison vectorization with Neon and MVE Christophe Lyon
@ 2021-09-28 11:11   ` Kyrylo Tkachov
  0 siblings, 0 replies; 20+ messages in thread
From: Kyrylo Tkachov @ 2021-09-28 11:11 UTC (permalink / raw)
  To: Christophe Lyon; +Cc: gcc-patches

Hi Christophe,

Sorry for the delay.

> -----Original Message-----
> From: Gcc-patches <gcc-patches-
> bounces+kyrylo.tkachov=arm.com@gcc.gnu.org> On Behalf Of Christophe
> Lyon via Gcc-patches
> Sent: 07 September 2021 10:15
> To: gcc-patches@gcc.gnu.org
> Subject: [PATCH 01/13] arm: Add new tests for comparison vectorization
> with Neon and MVE
> 
> This patch mainly adds Neon tests similar to existing MVE ones,
> to make sure we do not break Neon when fixing MVE.
> 
> mve-vcmp-f32-2.c is similar to mve-vcmp-f32.c but uses a conditional
> with 2.0f and 3.0f constants to help scan-assembler-times.
> 
> 2021-09-01  Christophe Lyon <christophe.lyon@foss.st.com>
> 
> 	gcc/testsuite/
> 	* gcc.target/arm/simd/mve-vcmp-f32-2.c: New.
> 	* gcc.target/arm/simd/neon-compare-1.c: New.
> 	* gcc.target/arm/simd/neon-compare-2.c: New.
> 	* gcc.target/arm/simd/neon-compare-3.c: New.
> 	* gcc.target/arm/simd/neon-compare-scalar-1.c: New.
> 	* gcc.target/arm/simd/neon-vcmp-f16.c: New.
> 	* gcc.target/arm/simd/neon-vcmp-f32-2.c: New.
> 	* gcc.target/arm/simd/neon-vcmp-f32-3.c: New.
> 	* gcc.target/arm/simd/neon-vcmp-f32.c: New.
> 	* gcc.target/arm/simd/neon-vcmp.c: New.

Thanks,
Kyrill

> 
> diff --git a/gcc/testsuite/gcc.target/arm/simd/mve-vcmp-f32-2.c b/gcc/testsuite/gcc.target/arm/simd/mve-vcmp-f32-2.c
> new file mode 100644
> index 00000000000..917a95bf141
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/simd/mve-vcmp-f32-2.c
> @@ -0,0 +1,32 @@
> +/* { dg-do assemble } */
> +/* { dg-require-effective-target arm_v8_1m_mve_fp_ok } */
> +/* { dg-add-options arm_v8_1m_mve_fp } */
> +/* { dg-additional-options "-O3 -funsafe-math-optimizations" } */
> +
> +#include <stdint.h>
> +
> +#define NB 4
> +
> +#define FUNC(OP, NAME)							\
> +  void test_ ## NAME ##_f (float * __restrict__ dest, float *a, float *b) { \
> +    int i;								\
> +    for (i=0; i<NB; i++) {						\
> +      dest[i] = (a[i] OP b[i]) ? 2.0f : 3.0f;				\
> +    }									\
> +  }
> +
> +FUNC(==, vcmpeq)
> +FUNC(!=, vcmpne)
> +FUNC(<, vcmplt)
> +FUNC(<=, vcmple)
> +FUNC(>, vcmpgt)
> +FUNC(>=, vcmpge)
> +
> +/* { dg-final { scan-assembler-times {\tvcmp.f32\teq, q[0-9]+, q[0-9]+\n} 1 } } */
> +/* { dg-final { scan-assembler-times {\tvcmp.f32\tne, q[0-9]+, q[0-9]+\n} 1 } } */
> +/* { dg-final { scan-assembler-times {\tvcmp.f32\tlt, q[0-9]+, q[0-9]+\n} 1 } } */
> +/* { dg-final { scan-assembler-times {\tvcmp.f32\tle, q[0-9]+, q[0-9]+\n} 1 } } */
> +/* { dg-final { scan-assembler-times {\tvcmp.f32\tgt, q[0-9]+, q[0-9]+\n} 1 } } */
> +/* { dg-final { scan-assembler-times {\tvcmp.f32\tge, q[0-9]+, q[0-9]+\n} 1 } } */
> +/* { dg-final { scan-assembler-times {\t.word\t1073741824\n} 24 } } */ /* Constant 2.0f.  */
> +/* { dg-final { scan-assembler-times {\t.word\t1077936128\n} 24 } } */ /* Constant 3.0f.  */
> diff --git a/gcc/testsuite/gcc.target/arm/simd/neon-compare-1.c b/gcc/testsuite/gcc.target/arm/simd/neon-compare-1.c
> new file mode 100644
> index 00000000000..2e0222a71f2
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/simd/neon-compare-1.c
> @@ -0,0 +1,78 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target arm_neon_ok } */
> +/* { dg-add-options arm_neon } */
> +/* { dg-additional-options "-O3" } */
> +
> +#include "mve-compare-1.c"
> +
> +/* 64-bit vectors.  */
> +/* vmvn is used by 'ne' comparisons: 3 sizes * 2 (signed/unsigned) * 2
> +   (register/zero) = 12.  */
> +/* { dg-final { scan-assembler-times {\tvmvn\td[0-9]+, d[0-9]+\n} 12 } } */
> +
> +/* { 8 bits } x { eq, ne, lt, le, gt, ge }. */
> +/* ne uses eq, lt/le only apply to comparison with zero, they use gt/ge
> +   otherwise.  */
> +/* { dg-final { scan-assembler-times {\tvceq.i8\td[0-9]+, d[0-9]+, d[0-9]+\n} 4 } } */
> +/* { dg-final { scan-assembler-times {\tvceq.i8\td[0-9]+, d[0-9]+, #0\n} 4 } } */
> +/* { dg-final { scan-assembler-times {\tvclt.s8\td[0-9]+, d[0-9]+, #0\n} 1 } } */
> +/* { dg-final { scan-assembler-times {\tvcle.s8\td[0-9]+, d[0-9]+, #0\n} 1 } } */
> +/* { dg-final { scan-assembler-times {\tvcgt.s8\td[0-9]+, d[0-9]+, d[0-9]+\n} 2 } } */
> +/* { dg-final { scan-assembler-times {\tvcgt.s8\td[0-9]+, d[0-9]+, #0\n} 1 } } */
> +/* { dg-final { scan-assembler-times {\tvcge.s8\td[0-9]+, d[0-9]+, d[0-9]+\n} 2 } } */
> +/* { dg-final { scan-assembler-times {\tvcge.s8\td[0-9]+, d[0-9]+, #0\n} 1 } } */
> +
> +/* { 16 bits } x { eq, ne, lt, le, gt, ge }. */
> +/* { dg-final { scan-assembler-times {\tvceq.i16\td[0-9]+, d[0-9]+, d[0-9]+\n} 4 } } */
> +/* { dg-final { scan-assembler-times {\tvceq.i16\td[0-9]+, d[0-9]+, #0\n} 4 } } */
> +/* { dg-final { scan-assembler-times {\tvclt.s16\td[0-9]+, d[0-9]+, #0\n} 1 } } */
> +/* { dg-final { scan-assembler-times {\tvcle.s16\td[0-9]+, d[0-9]+, #0\n} 1 } } */
> +/* { dg-final { scan-assembler-times {\tvcgt.s16\td[0-9]+, d[0-9]+, d[0-9]+\n} 2 } } */
> +/* { dg-final { scan-assembler-times {\tvcgt.s16\td[0-9]+, d[0-9]+, #0\n} 1 } } */
> +/* { dg-final { scan-assembler-times {\tvcge.s16\td[0-9]+, d[0-9]+, d[0-9]+\n} 2 } } */
> +/* { dg-final { scan-assembler-times {\tvcge.s16\td[0-9]+, d[0-9]+, #0\n} 1 } } */
> +
> +/* { 32 bits } x { eq, ne, lt, le, gt, ge }. */
> +/* { dg-final { scan-assembler-times {\tvceq.i32\td[0-9]+, d[0-9]+, d[0-9]+\n} 4 } } */
> +/* { dg-final { scan-assembler-times {\tvceq.i32\td[0-9]+, d[0-9]+, #0\n} 4 } } */
> +/* { dg-final { scan-assembler-times {\tvclt.s32\td[0-9]+, d[0-9]+, #0\n} 1 } } */
> +/* { dg-final { scan-assembler-times {\tvcle.s32\td[0-9]+, d[0-9]+, #0\n} 1 } } */
> +/* { dg-final { scan-assembler-times {\tvcgt.s32\td[0-9]+, d[0-9]+, d[0-9]+\n} 2 } } */
> +/* { dg-final { scan-assembler-times {\tvcgt.s32\td[0-9]+, d[0-9]+, #0\n} 1 } } */
> +/* { dg-final { scan-assembler-times {\tvcge.s32\td[0-9]+, d[0-9]+, d[0-9]+\n} 2 } } */
> +/* { dg-final { scan-assembler-times {\tvcge.s32\td[0-9]+, d[0-9]+, #0\n} 1 } } */
> +
> +/* 128-bit vectors.  */
> +
> +/* vmvn is used by 'ne' comparisons.  */
> +/* { dg-final { scan-assembler-times {\tvmvn\tq[0-9]+, q[0-9]+\n} 12 } } */
> +
> +/* { 8 bits } x { eq, ne, lt, le, gt, ge }.  */
> +/* { dg-final { scan-assembler-times {\tvceq.i8\tq[0-9]+, q[0-9]+, q[0-9]+\n} 4 } } */
> +/* { dg-final { scan-assembler-times {\tvceq.i8\tq[0-9]+, q[0-9]+, #0\n} 4 } } */
> +/* { dg-final { scan-assembler-times {\tvclt.s8\tq[0-9]+, q[0-9]+, #0\n} 1 } } */
> +/* { dg-final { scan-assembler-times {\tvcle.s8\tq[0-9]+, q[0-9]+, #0\n} 1 } } */
> +/* { dg-final { scan-assembler-times {\tvcgt.s8\tq[0-9]+, q[0-9]+, q[0-9]+\n} 2 } } */
> +/* { dg-final { scan-assembler-times {\tvcgt.s8\tq[0-9]+, q[0-9]+, #0\n} 1 } } */
> +/* { dg-final { scan-assembler-times {\tvcge.s8\tq[0-9]+, q[0-9]+, q[0-9]+\n} 2 } } */
> +/* { dg-final { scan-assembler-times {\tvcge.s8\tq[0-9]+, q[0-9]+, #0\n} 1 } } */
> +
> +/* { 16 bits } x { eq, ne, lt, le, gt, ge }.  */
> +/* { dg-final { scan-assembler-times {\tvceq.i16\tq[0-9]+, q[0-9]+, q[0-9]+\n} 4 } } */
> +/* { dg-final { scan-assembler-times {\tvceq.i16\tq[0-9]+, q[0-9]+, #0\n} 4 } } */
> +/* { dg-final { scan-assembler-times {\tvclt.s16\tq[0-9]+, q[0-9]+, #0\n} 1 } } */
> +/* { dg-final { scan-assembler-times {\tvcle.s16\tq[0-9]+, q[0-9]+, #0\n} 1 } } */
> +/* { dg-final { scan-assembler-times {\tvcgt.s16\tq[0-9]+, q[0-9]+, q[0-9]+\n} 2 } } */
> +/* { dg-final { scan-assembler-times {\tvcgt.s16\tq[0-9]+, q[0-9]+, #0\n} 1 } } */
> +/* { dg-final { scan-assembler-times {\tvcge.s16\tq[0-9]+, q[0-9]+, q[0-9]+\n} 2 } } */
> +/* { dg-final { scan-assembler-times {\tvcge.s16\tq[0-9]+, q[0-9]+, #0\n} 1 } } */
> +
> +/* { 32 bits } x { eq, ne, lt, le, gt, ge }.  */
> +/* { dg-final { scan-assembler-times {\tvceq.i32\tq[0-9]+, q[0-9]+, q[0-9]+\n} 4 } } */
> +/* { dg-final { scan-assembler-times {\tvceq.i32\tq[0-9]+, q[0-9]+, #0\n} 4 } } */
> +/* { dg-final { scan-assembler-times {\tvclt.s32\tq[0-9]+, q[0-9]+, #0\n} 1 } } */
> +/* { dg-final { scan-assembler-times {\tvcle.s32\tq[0-9]+, q[0-9]+, #0\n} 1 } } */
> +/* { dg-final { scan-assembler-times {\tvcgt.s32\tq[0-9]+, q[0-9]+, q[0-9]+\n} 2 } } */
> +/* { dg-final { scan-assembler-times {\tvcgt.s32\tq[0-9]+, q[0-9]+, #0\n} 1 } } */
> +/* { dg-final { scan-assembler-times {\tvcge.s32\tq[0-9]+, q[0-9]+, q[0-9]+\n} 2 } } */
> +/* { dg-final { scan-assembler-times {\tvcge.s32\tq[0-9]+, q[0-9]+, #0\n} 1 } } */
> diff --git a/gcc/testsuite/gcc.target/arm/simd/neon-compare-2.c b/gcc/testsuite/gcc.target/arm/simd/neon-compare-2.c
> new file mode 100644
> index 00000000000..06f3c14c91e
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/simd/neon-compare-2.c
> @@ -0,0 +1,13 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target arm_neon_ok } */
> +/* { dg-add-options arm_neon } */
> +/* { dg-additional-options "-O3 -funsafe-math-optimizations" } */
> +
> +#include "mve-compare-2.c"
> +
> +/* eq, ne, lt, le, gt, ge.  */
> +/* ne uses eq+vmvn, lt/le use gt/ge with swapped operands.  */
> +/* { dg-final { scan-assembler-times {\tvceq.f32\tq[0-9]+, q[0-9]+, q[0-9]+\n} 2 } } */
> +/* { dg-final { scan-assembler-times {\tvmvn\tq[0-9]+, q[0-9]+\n} 1 } } */
> +/* { dg-final { scan-assembler-times {\tvcgt.f32\tq[0-9]+, q[0-9]+, q[0-9]+\n} 2 } } */
> +/* { dg-final { scan-assembler-times {\tvcge.f32\tq[0-9]+, q[0-9]+, q[0-9]+\n} 2 } } */
> diff --git a/gcc/testsuite/gcc.target/arm/simd/neon-compare-3.c
> b/gcc/testsuite/gcc.target/arm/simd/neon-compare-3.c
> new file mode 100644
> index 00000000000..9c9f108843b
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/simd/neon-compare-3.c
> @@ -0,0 +1,14 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target arm_v8_2a_fp16_neon_ok } */
> +/* { dg-add-options arm_v8_2a_fp16_neon } */
> +/* { dg-additional-options "-O3 -funsafe-math-optimizations" } */
> +
> +#include "mve-compare-3.c"
> +
> +
> +/* eq, ne, lt, le, gt, ge.  */
> +/* ne uses eq+vmvn, lt/le use gt/ge with swapped operands.  */
> +/* { dg-final { scan-assembler-times {\tvceq.f16\tq[0-9]+, q[0-9]+, q[0-9]+\n} 2 } } */
> +/* { dg-final { scan-assembler-times {\tvmvn\tq[0-9]+, q[0-9]+\n} 1 } } */
> +/* { dg-final { scan-assembler-times {\tvcgt.f16\tq[0-9]+, q[0-9]+, q[0-9]+\n} 2 } } */
> +/* { dg-final { scan-assembler-times {\tvcge.f16\tq[0-9]+, q[0-9]+, q[0-9]+\n} 2 } } */
> diff --git a/gcc/testsuite/gcc.target/arm/simd/neon-compare-scalar-1.c
> b/gcc/testsuite/gcc.target/arm/simd/neon-compare-scalar-1.c
> new file mode 100644
> index 00000000000..0783624a3f2
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/simd/neon-compare-scalar-1.c
> @@ -0,0 +1,57 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target arm_neon_ok } */
> +/* { dg-add-options arm_neon } */
> +/* { dg-additional-options "-O3" } */
> +
> +#include "mve-compare-scalar-1.c"
> +
> +/* 64-bit vectors.  */
> +/* vmvn is used by 'ne' comparisons.  */
> +/* { dg-final { scan-assembler-times {\tvmvn\td[0-9]+, d[0-9]+\n} 6 } } */
> +
> +/* { 8 bits } x { eq, ne, lt, le, gt, ge }.  */
> +/* { dg-final { scan-assembler-times {\tvceq.i8\td[0-9]+, d[0-9]+, d[0-9]+\n} 4 } } */
> +/* { dg-final { scan-assembler-times {\tvcgt.s8\td[0-9]+, d[0-9]+, d[0-9]+\n} 2 } } */
> +/* { dg-final { scan-assembler-times {\tvcgt.u8\td[0-9]+, d[0-9]+, d[0-9]+\n} 2 } } */
> +/* { dg-final { scan-assembler-times {\tvcge.s8\td[0-9]+, d[0-9]+, d[0-9]+\n} 2 } } */
> +/* { dg-final { scan-assembler-times {\tvcge.u8\td[0-9]+, d[0-9]+, d[0-9]+\n} 2 } } */
> +
> +/* { 16 bits } x { eq, ne, lt, le, gt, ge }.  */
> +/* { dg-final { scan-assembler-times {\tvceq.i16\td[0-9]+, d[0-9]+, d[0-9]+\n} 4 } } */
> +/* { dg-final { scan-assembler-times {\tvcgt.s16\td[0-9]+, d[0-9]+, d[0-9]+\n} 2 } } */
> +/* { dg-final { scan-assembler-times {\tvcgt.u16\td[0-9]+, d[0-9]+, d[0-9]+\n} 2 } } */
> +/* { dg-final { scan-assembler-times {\tvcge.s16\td[0-9]+, d[0-9]+, d[0-9]+\n} 2 } } */
> +/* { dg-final { scan-assembler-times {\tvcge.u16\td[0-9]+, d[0-9]+, d[0-9]+\n} 2 } } */
> +
> +/* { 32 bits } x { eq, ne, lt, le, gt, ge }.  */
> +/* { dg-final { scan-assembler-times {\tvceq.i32\td[0-9]+, d[0-9]+, d[0-9]+\n} 4 } } */
> +/* { dg-final { scan-assembler-times {\tvcgt.s32\td[0-9]+, d[0-9]+, d[0-9]+\n} 2 } } */
> +/* { dg-final { scan-assembler-times {\tvcgt.u32\td[0-9]+, d[0-9]+, d[0-9]+\n} 2 } } */
> +/* { dg-final { scan-assembler-times {\tvcge.s32\td[0-9]+, d[0-9]+, d[0-9]+\n} 2 } } */
> +/* { dg-final { scan-assembler-times {\tvcge.u32\td[0-9]+, d[0-9]+, d[0-9]+\n} 2 } } */
> +
> +/* 128-bit vectors.  */
> +
> +/* vmvn is used by 'ne' comparisons.  */
> +/* { dg-final { scan-assembler-times {\tvmvn\tq[0-9]+, q[0-9]+\n} 6 } } */
> +
> +/* { 8 bits } x { eq, ne, lt, le, gt, ge }.  */
> +/* { dg-final { scan-assembler-times {\tvceq.i8\tq[0-9]+, q[0-9]+, q[0-9]+\n} 4 } } */
> +/* { dg-final { scan-assembler-times {\tvcgt.s8\tq[0-9]+, q[0-9]+, q[0-9]+\n} 2 } } */
> +/* { dg-final { scan-assembler-times {\tvcgt.u8\tq[0-9]+, q[0-9]+, q[0-9]+\n} 2 } } */
> +/* { dg-final { scan-assembler-times {\tvcge.s8\tq[0-9]+, q[0-9]+, q[0-9]+\n} 2 } } */
> +/* { dg-final { scan-assembler-times {\tvcge.u8\tq[0-9]+, q[0-9]+, q[0-9]+\n} 2 } } */
> +
> +/* { 16 bits } x { eq, ne, lt, le, gt, ge }.  */
> +/* { dg-final { scan-assembler-times {\tvceq.i16\tq[0-9]+, q[0-9]+, q[0-9]+\n} 4 } } */
> +/* { dg-final { scan-assembler-times {\tvcgt.s16\tq[0-9]+, q[0-9]+, q[0-9]+\n} 2 } } */
> +/* { dg-final { scan-assembler-times {\tvcgt.u16\tq[0-9]+, q[0-9]+, q[0-9]+\n} 2 } } */
> +/* { dg-final { scan-assembler-times {\tvcge.s16\tq[0-9]+, q[0-9]+, q[0-9]+\n} 2 } } */
> +/* { dg-final { scan-assembler-times {\tvcge.u16\tq[0-9]+, q[0-9]+, q[0-9]+\n} 2 } } */
> +
> +/* { 32 bits } x { eq, ne, lt, le, gt, ge }.  */
> +/* { dg-final { scan-assembler-times {\tvceq.i32\tq[0-9]+, q[0-9]+, q[0-9]+\n} 4 } } */
> +/* { dg-final { scan-assembler-times {\tvcgt.s32\tq[0-9]+, q[0-9]+, q[0-9]+\n} 2 } } */
> +/* { dg-final { scan-assembler-times {\tvcgt.u32\tq[0-9]+, q[0-9]+, q[0-9]+\n} 2 } } */
> +/* { dg-final { scan-assembler-times {\tvcge.s32\tq[0-9]+, q[0-9]+, q[0-9]+\n} 2 } } */
> +/* { dg-final { scan-assembler-times {\tvcge.u32\tq[0-9]+, q[0-9]+, q[0-9]+\n} 2 } } */
> diff --git a/gcc/testsuite/gcc.target/arm/simd/neon-vcmp-f16.c
> b/gcc/testsuite/gcc.target/arm/simd/neon-vcmp-f16.c
> new file mode 100644
> index 00000000000..688fd9a235f
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/simd/neon-vcmp-f16.c
> @@ -0,0 +1,12 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target arm_v8_2a_fp16_neon_ok } */
> +/* { dg-add-options arm_v8_2a_fp16_neon } */
> +/* { dg-additional-options "-O3 -funsafe-math-optimizations" } */
> +
> +#include "mve-vcmp-f16.c"
> +
> +/* 'ne' uses vceq.  */
> +/* le and lt use ge and gt with inverted operands.  */
> +/* { dg-final { scan-assembler-times {\tvceq.f16\tq[0-9]+, q[0-9]+, q[0-9]+\n} 2 } } */
> +/* { dg-final { scan-assembler-times {\tvcge.f16\tq[0-9]+, q[0-9]+, q[0-9]+\n} 2 } } */
> +/* { dg-final { scan-assembler-times {\tvcgt.f16\tq[0-9]+, q[0-9]+, q[0-9]+\n} 2 } } */
> diff --git a/gcc/testsuite/gcc.target/arm/simd/neon-vcmp-f32-2.c
> b/gcc/testsuite/gcc.target/arm/simd/neon-vcmp-f32-2.c
> new file mode 100644
> index 00000000000..a22923eb242
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/simd/neon-vcmp-f32-2.c
> @@ -0,0 +1,15 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target arm_neon_ok } */
> +/* { dg-add-options arm_neon } */
> +/* { dg-additional-options "-O3 -funsafe-math-optimizations" } */
> +
> +#include "mve-vcmp-f32-2.c"
> +
> +/* 'ne' uses vceq.  */
> +/* le and lt use ge and gt with inverted operands.  */
> +/* { dg-final { scan-assembler-times {\tvceq.f32\tq[0-9]+, q[0-9]+, q[0-9]+\n} 2 } } */
> +/* { dg-final { scan-assembler-times {\tvcge.f32\tq[0-9]+, q[0-9]+, q[0-9]+\n} 2 } } */
> +/* { dg-final { scan-assembler-times {\tvcgt.f32\tq[0-9]+, q[0-9]+, q[0-9]+\n} 2 } } */
> +
> +/* { dg-final { scan-assembler-times {\tvmov.f32\tq[0-9]+, #2.0e\+0} 6 } } */
> +/* { dg-final { scan-assembler-times {\tvmov.f32\tq[0-9]+, #3.0e\+0} 6 } } */
> diff --git a/gcc/testsuite/gcc.target/arm/simd/neon-vcmp-f32-3.c
> b/gcc/testsuite/gcc.target/arm/simd/neon-vcmp-f32-3.c
> new file mode 100644
> index 00000000000..4f12f043d3a
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/simd/neon-vcmp-f32-3.c
> @@ -0,0 +1,12 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target arm_neon_ok } */
> +/* { dg-add-options arm_neon } */
> +/* { dg-additional-options "-O3" } */
> +
> +#include "mve-vcmp-f32.c"
> +
> +/* Should not be vectorized, since we do not use -funsafe-math-optimizations.  */
> +
> +/* { dg-final { scan-assembler-not {\tvceq.f32\tq[0-9]+, q[0-9]+, q[0-9]+\n} } } */
> +/* { dg-final { scan-assembler-not {\tvcge.f32\tq[0-9]+, q[0-9]+, q[0-9]+\n} } } */
> +/* { dg-final { scan-assembler-not {\tvcgt.f32\tq[0-9]+, q[0-9]+, q[0-9]+\n} } } */
> diff --git a/gcc/testsuite/gcc.target/arm/simd/neon-vcmp-f32.c
> b/gcc/testsuite/gcc.target/arm/simd/neon-vcmp-f32.c
> new file mode 100644
> index 00000000000..06e5c4fd1d1
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/simd/neon-vcmp-f32.c
> @@ -0,0 +1,12 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target arm_neon_ok } */
> +/* { dg-add-options arm_neon } */
> +/* { dg-additional-options "-O3 -funsafe-math-optimizations" } */
> +
> +#include "mve-vcmp-f32.c"
> +
> +/* 'ne' uses vceq.  */
> +/* le and lt use ge and gt with inverted operands.  */
> +/* { dg-final { scan-assembler-times {\tvceq.f32\tq[0-9]+, q[0-9]+, q[0-9]+\n} 2 } } */
> +/* { dg-final { scan-assembler-times {\tvcge.f32\tq[0-9]+, q[0-9]+, q[0-9]+\n} 2 } } */
> +/* { dg-final { scan-assembler-times {\tvcgt.f32\tq[0-9]+, q[0-9]+, q[0-9]+\n} 2 } } */
> diff --git a/gcc/testsuite/gcc.target/arm/simd/neon-vcmp.c
> b/gcc/testsuite/gcc.target/arm/simd/neon-vcmp.c
> new file mode 100644
> index 00000000000..f2b92b1be7f
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/simd/neon-vcmp.c
> @@ -0,0 +1,22 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target arm_neon_ok } */
> +/* { dg-add-options arm_neon } */
> +/* { dg-additional-options "-O3" } */
> +
> +#include "mve-vcmp.c"
> +
> +/* vceq is also used for 'ne' comparisons.  */
> +/* { dg-final { scan-assembler-times {\tvceq.i[0-9]+\td[0-9]+, d[0-9]+, d[0-9]+\n} 12 } } */
> +/* { dg-final { scan-assembler-times {\tvceq.i[0-9]+\tq[0-9]+, q[0-9]+, q[0-9]+\n} 12 } } */
> +
> +/* lt and le are replaced with the opposite condition, hence the double number
> +   of matches for gt and ge.  */
> +/* { dg-final { scan-assembler-times {\tvcge.s[0-9]+\td[0-9]+, d[0-9]+, d[0-9]+\n} 6 } } */
> +/* { dg-final { scan-assembler-times {\tvcge.s[0-9]+\tq[0-9]+, q[0-9]+, q[0-9]+\n} 6 } } */
> +/* { dg-final { scan-assembler-times {\tvcge.u[0-9]+\td[0-9]+, d[0-9]+, d[0-9]+\n} 6 } } */
> +/* { dg-final { scan-assembler-times {\tvcge.u[0-9]+\tq[0-9]+, q[0-9]+, q[0-9]+\n} 6 } } */
> +
> +/* { dg-final { scan-assembler-times {\tvcgt.s[0-9]+\td[0-9]+, d[0-9]+, d[0-9]+\n} 6 } } */
> +/* { dg-final { scan-assembler-times {\tvcgt.s[0-9]+\tq[0-9]+, q[0-9]+, q[0-9]+\n} 6 } } */
> +/* { dg-final { scan-assembler-times {\tvcgt.u[0-9]+\td[0-9]+, d[0-9]+, d[0-9]+\n} 6 } } */
> +/* { dg-final { scan-assembler-times {\tvcgt.u[0-9]+\tq[0-9]+, q[0-9]+, q[0-9]+\n} 6 } } */
> --
> 2.25.1
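
The comments in the Neon tests above ("ne uses eq+vmvn, lt/le use gt/ge with swapped operands") explain why each scan-assembler-times count for vceq, vcgt and vcge is doubled. As a rough illustration only, here is a lane-wise C model of those identities. The helper names (mask, lane_eq, lane_ne, ...) are made up for this sketch and are not part of the patch or of any Arm header.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: all-ones lane mask when COND holds, zero otherwise.  */
static uint32_t mask(int cond) { return cond ? UINT32_MAX : 0; }

/* The three comparisons treated as "native" in this model.  */
static uint32_t lane_eq(float a, float b) { return mask(a == b); }
static uint32_t lane_gt(float a, float b) { return mask(a > b); }
static uint32_t lane_ge(float a, float b) { return mask(a >= b); }

/* The three remaining comparisons, derived as the test comments describe.  */
static uint32_t lane_ne(float a, float b) { return ~lane_eq(a, b); } /* eq, then bitwise not */
static uint32_t lane_lt(float a, float b) { return lane_gt(b, a); }  /* gt, operands swapped */
static uint32_t lane_le(float a, float b) { return lane_ge(b, a); }  /* ge, operands swapped */

/* Compare the derived forms against the C operators on a few values.  */
void check_derived_compares(void)
{
  static const float v[] = { -1.0f, 0.0f, 2.0f };
  for (int i = 0; i < 3; i++)
    for (int j = 0; j < 3; j++)
      {
        float a = v[i], b = v[j];
        assert(lane_ne(a, b) == mask(a != b));
        assert(lane_lt(a, b) == mask(a < b));
        assert(lane_le(a, b) == mask(a <= b));
      }
}
```

With six source-level comparisons mapped onto three compare forms, each of eq, gt and ge is expected twice per element type, which matches the counts of 2 in the directives.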


^ permalink raw reply	[flat|nested] 20+ messages in thread

* RE: [PATCH 02/13] arm: Add tests for PR target/100757
  2021-09-07  9:15 ` [PATCH 02/13] arm: Add tests for PR target/100757 Christophe Lyon
@ 2021-09-28 11:12   ` Kyrylo Tkachov
  2021-09-28 13:28     ` Christophe LYON
  0 siblings, 1 reply; 20+ messages in thread
From: Kyrylo Tkachov @ 2021-09-28 11:12 UTC (permalink / raw)
  To: Christophe Lyon; +Cc: gcc-patches



> -----Original Message-----
> From: Gcc-patches <gcc-patches-
> bounces+kyrylo.tkachov=arm.com@gcc.gnu.org> On Behalf Of Christophe
> Lyon via Gcc-patches
> Sent: 07 September 2021 10:15
> To: gcc-patches@gcc.gnu.org
> Subject: [PATCH 02/13] arm: Add tests for PR target/100757
> 
> These tests currently trigger an ICE which is fixed later in the patch
> series.
> 
> The pr100757*.c testcases are derived from
> gcc.c-torture/compile/20160205-1.c, forcing the use of MVE, and using
> various types and return values different from 0 and 1 to avoid
> commonalization with boolean masks.  In addition, since we should not
> need these masks, the tests make sure they are not present.

Ok, but I'd rather it was committed together with the patch that fixes the ICE.
I don't mind if it's a separate commit or rolled into that patch.

Thanks,
Kyrill

> 
> 2021-09-01  Christophe Lyon  <christophe.lyon@foss.st.com>
> 
> 	gcc/testsuite/
> 	PR target/100757
> 	* gcc.target/arm/simd/pr100757-2.c: New.
> 	* gcc.target/arm/simd/pr100757-3.c: New.
> 	* gcc.target/arm/simd/pr100757-4.c: New.
> 	* gcc.target/arm/simd/pr100757.c: New.
> 
> diff --git a/gcc/testsuite/gcc.target/arm/simd/pr100757-2.c
> b/gcc/testsuite/gcc.target/arm/simd/pr100757-2.c
> new file mode 100644
> index 00000000000..c2262b4d81e
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/simd/pr100757-2.c
> @@ -0,0 +1,20 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target arm_v8_1m_mve_fp_ok } */
> +/* { dg-add-options arm_v8_1m_mve_fp } */
> +/* { dg-additional-options "-O3 -funsafe-math-optimizations" } */
> +/* Derived from gcc.c-torture/compile/20160205-1.c.  */
> +
> +float a[32];
> +int fn1(int d) {
> +  int c = 4;
> +  for (int b = 0; b < 32; b++)
> +    if (a[b] != 2.0f)
> +      c = 5;
> +  return c;
> +}
> +
> +/* { dg-final { scan-assembler-times {\t.word\t1073741824\n} 4 } } */ /* Constant 2.0f.  */
> +/* { dg-final { scan-assembler-times {\t.word\t4\n} 4 } } */ /* Initial value for c.  */
> +/* { dg-final { scan-assembler-times {\t.word\t5\n} 4 } } */ /* Possible value for c.  */
> +/* { dg-final { scan-assembler-not {\t.word\t1\n} } } */ /* 'true' mask.  */
> +/* { dg-final { scan-assembler-not {\t.word\t0\n} } } */ /* 'false' mask.  */
> diff --git a/gcc/testsuite/gcc.target/arm/simd/pr100757-3.c
> b/gcc/testsuite/gcc.target/arm/simd/pr100757-3.c
> new file mode 100644
> index 00000000000..e604555c04c
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/simd/pr100757-3.c
> @@ -0,0 +1,20 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target arm_v8_1m_mve_fp_ok } */
> +/* { dg-add-options arm_v8_1m_mve_fp } */
> +/* { dg-additional-options "-O3 -funsafe-math-optimizations" } */
> +/* Copied from gcc.c-torture/compile/20160205-1.c.  */
> +
> +float a[32];
> +float fn1(int d) {
> +  float c = 4.0f;
> +  for (int b = 0; b < 32; b++)
> +    if (a[b] != 2.0f)
> +      c = 5.0f;
> +  return c;
> +}
> +
> +/* { dg-final { scan-assembler-times {\t.word\t1073741824\n} 4 } } */ /* Constant 2.0f.  */
> +/* { dg-final { scan-assembler-times {\t.word\t1084227584\n} 4 } } */ /* Possible value for c (5.0).  */
> +/* { dg-final { scan-assembler-times {\t.word\t1082130432\n} 4 } } */ /* Initial value for c (4.0).  */
> +/* { dg-final { scan-assembler-not {\t.word\t1\n} } } */ /* 'true' mask.  */
> +/* { dg-final { scan-assembler-not {\t.word\t0\n} } } */ /* 'false' mask.  */
> diff --git a/gcc/testsuite/gcc.target/arm/simd/pr100757-4.c
> b/gcc/testsuite/gcc.target/arm/simd/pr100757-4.c
> new file mode 100644
> index 00000000000..c12040c517f
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/simd/pr100757-4.c
> @@ -0,0 +1,19 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target arm_v8_1m_mve_ok } */
> +/* { dg-add-options arm_v8_1m_mve } */
> +/* { dg-additional-options "-O3" } */
> +/* Derived from gcc.c-torture/compile/20160205-1.c.  */
> +
> +unsigned int a[32];
> +int fn1(int d) {
> +  int c = 2;
> +  for (int b = 0; b < 32; b++)
> +    if (a[b])
> +      c = 3;
> +  return c;
> +}
> +
> +/* { dg-final { scan-assembler-times {\t.word\t0\n} 4 } } */ /* 'false' mask.  */
> +/* { dg-final { scan-assembler-not {\t.word\t1\n} } } */ /* 'true' mask.  */
> +/* { dg-final { scan-assembler-times {\t.word\t2\n} 4 } } */ /* Initial value for c.  */
> +/* { dg-final { scan-assembler-times {\t.word\t3\n} 4 } } */ /* Possible value for c.  */
> diff --git a/gcc/testsuite/gcc.target/arm/simd/pr100757.c
> b/gcc/testsuite/gcc.target/arm/simd/pr100757.c
> new file mode 100644
> index 00000000000..41d6e4e2d7a
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/simd/pr100757.c
> @@ -0,0 +1,19 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target arm_v8_1m_mve_ok } */
> +/* { dg-add-options arm_v8_1m_mve } */
> +/* { dg-additional-options "-O3" } */
> +/* Derived from gcc.c-torture/compile/20160205-1.c.  */
> +
> +int a[32];
> +int fn1(int d) {
> +  int c = 2;
> +  for (int b = 0; b < 32; b++)
> +    if (a[b])
> +      c = 3;
> +  return c;
> +}
> +
> +/* { dg-final { scan-assembler-times {\t.word\t0\n} 4 } } */ /* 'false' mask.  */
> +/* { dg-final { scan-assembler-not {\t.word\t1\n} } } */ /* 'true' mask.  */
> +/* { dg-final { scan-assembler-times {\t.word\t2\n} 4 } } */ /* Initial value for c.  */
> +/* { dg-final { scan-assembler-times {\t.word\t3\n} 4 } } */ /* Possible value for c.  */
> --
> 2.25.1
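
The floating-point pr100757 tests above scan for raw .word values rather than float literals. As a sanity check on those numbers, here is a small sketch, assuming IEEE 754 single precision, that recovers the integer each float constant would be emitted as. float_bits and check_word_constants are names invented for this illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: reinterpret a float as its 32-bit pattern.  */
static uint32_t float_bits(float f)
{
  uint32_t u;
  memcpy(&u, &f, sizeof u);   /* well-defined way to read the bits */
  return u;
}

/* The decimal values the tests look for after "\t.word\t".  */
void check_word_constants(void)
{
  assert(float_bits(2.0f) == 1073741824u);  /* 0x40000000 */
  assert(float_bits(4.0f) == 1082130432u);  /* 0x40800000 */
  assert(float_bits(5.0f) == 1084227584u);  /* 0x40A00000 */
}
```

Under that assumption, 1082130432 is 4.0f (the initial value of c in pr100757-3.c) and 1084227584 is 5.0f (the value assigned in the loop).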


^ permalink raw reply	[flat|nested] 20+ messages in thread

* RE: [PATCH 03/13] arm: Add test for PR target/101325
  2021-09-07  9:15 ` [PATCH 03/13] arm: Add test for PR target/101325 Christophe Lyon
@ 2021-09-28 11:14   ` Kyrylo Tkachov
  2021-09-28 13:30     ` Christophe LYON
  0 siblings, 1 reply; 20+ messages in thread
From: Kyrylo Tkachov @ 2021-09-28 11:14 UTC (permalink / raw)
  To: Christophe Lyon; +Cc: gcc-patches, Alex Coplan, Andrea Corallo



> -----Original Message-----
> From: Gcc-patches <gcc-patches-
> bounces+kyrylo.tkachov=arm.com@gcc.gnu.org> On Behalf Of Christophe
> Lyon via Gcc-patches
> Sent: 07 September 2021 10:15
> To: gcc-patches@gcc.gnu.org
> Subject: [PATCH 03/13] arm: Add test for PR target/101325
> 
> This test is derived from the one provided in the PR: it is a
> compile-only test because I do not have access to anything that could
> execute it.  We can switch it to 'dg-do run' later, however it would
> be better to write a new executable test to ensure coverage in case
> the tester cannot execute such code (and it will need a new
> arm_v8_1m_mve_hw or similar effective-target).

The test is okay for now.
I think we'll want to have an arm_v8_1m_mve_hw target sooner or later.
Maybe Alex or Andrea can help to write one we can use?

Thanks,
Kyrill

> 
> 2021-09-01  Christophe Lyon  <christophe.lyon@foss.st.com>
> 
> 	gcc/testsuite/
> 	PR target/101325
> 	* gcc.target/arm/simd/pr101325.c: New.
> 
> diff --git a/gcc/testsuite/gcc.target/arm/simd/pr101325.c
> b/gcc/testsuite/gcc.target/arm/simd/pr101325.c
> new file mode 100644
> index 00000000000..a466683a0b1
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/simd/pr101325.c
> @@ -0,0 +1,14 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target arm_v8_1m_mve_ok } */
> +/* { dg-add-options arm_v8_1m_mve } */
> +/* { dg-additional-options "-O3" } */
> +
> +#include <arm_mve.h>
> +
> +unsigned foo(int8x16_t v, int8x16_t w)
> +{
> +  return vcmpeqq (v, w);
> +}
> +/* { dg-final { scan-assembler {\tvcmp.i8  eq} } } */
> +/* { dg-final { scan-assembler {\tvmrs\t r[0-9]+, P0} } } */
> +/* { dg-final { scan-assembler {\tuxth} } } */
> --
> 2.25.1


^ permalink raw reply	[flat|nested] 20+ messages in thread

* RE: [PATCH 04/13] arm: Add GENERAL_AND_VPR_REGS regclass
  2021-09-08  7:48         ` Christophe LYON
@ 2021-09-28 11:18           ` Kyrylo Tkachov
  2021-09-28 13:32             ` Christophe LYON
  0 siblings, 1 reply; 20+ messages in thread
From: Kyrylo Tkachov @ 2021-09-28 11:18 UTC (permalink / raw)
  To: Christophe LYON, Richard Earnshaw; +Cc: gcc-patches

Hi Christophe,

> -----Original Message-----
> From: Gcc-patches <gcc-patches-
> bounces+kyrylo.tkachov=arm.com@gcc.gnu.org> On Behalf Of Christophe
> LYON via Gcc-patches
> Sent: 08 September 2021 08:49
> To: Richard Earnshaw <Richard.Earnshaw@foss.arm.com>; gcc-
> patches@gcc.gnu.org
> Subject: Re: [PATCH 04/13] arm: Add GENERAL_AND_VPR_REGS regclass
> 
> 
> On 07/09/2021 15:35, Richard Earnshaw wrote:
> >
> >
> > On 07/09/2021 13:05, Christophe LYON wrote:
> >>
> >> On 07/09/2021 11:42, Richard Earnshaw wrote:
> >>>
> >>>
> >>> On 07/09/2021 10:15, Christophe Lyon via Gcc-patches wrote:
> >>>> At some point during the development of this patch series, it appeared
> >>>> that in some cases the register allocator wants “VPR or general”
> >>>> rather than “VPR or general or FP” (which is the same thing as
> >>>> ALL_REGS).  The series does not seem to require this anymore, but it
> >>>> seems to be a good thing to do anyway, to give the register allocator
> >>>> more freedom.
> >>>>
> >>>> 2021-09-01  Christophe Lyon <christophe.lyon@foss.st.com>
> >>>>
> >>>>     gcc/
> >>>>     * config/arm/arm.h (reg_class): Add GENERAL_AND_VPR_REGS.
> >>>>     (REG_CLASS_NAMES): Likewise.
> >>>>     (REG_CLASS_CONTENTS): Likewise. Add VPR_REG to ALL_REGS.
> >>>>
> >>>> diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
> >>>> index 015299c1534..fab39d05916 100644
> >>>> --- a/gcc/config/arm/arm.h
> >>>> +++ b/gcc/config/arm/arm.h
> >>>> @@ -1286,6 +1286,7 @@ enum reg_class
> >>>>     SFP_REG,
> >>>>     AFP_REG,
> >>>>     VPR_REG,
> >>>> +  GENERAL_AND_VPR_REGS,
> >>>>     ALL_REGS,
> >>>>     LIM_REG_CLASSES
> >>>>   };
> >>>> @@ -1315,6 +1316,7 @@ enum reg_class
> >>>>     "SFP_REG",        \
> >>>>     "AFP_REG",        \
> >>>>     "VPR_REG",        \
> >>>> +  "GENERAL_AND_VPR_REGS", \
> >>>>     "ALL_REGS"        \
> >>>>   }
> >>>>   @@ -1343,7 +1345,8 @@ enum reg_class
> >>>>     { 0x00000000, 0x00000000, 0x00000000, 0x00000040 }, /* SFP_REG */    \
> >>>>     { 0x00000000, 0x00000000, 0x00000000, 0x00000080 }, /* AFP_REG */    \
> >>>>     { 0x00000000, 0x00000000, 0x00000000, 0x00000400 }, /* VPR_REG.  */    \
> >>>> -  { 0xFFFF7FFF, 0xFFFFFFFF, 0xFFFFFFFF, 0x0000000F }  /* ALL_REGS.  */    \
> >>>> +  { 0x00005FFF, 0x00000000, 0x00000000, 0x00000400 }, /* GENERAL_AND_VPR_REGS.  */ \
> >>>> +  { 0xFFFF7FFF, 0xFFFFFFFF, 0xFFFFFFFF, 0x0000040F }  /* ALL_REGS.  */    \
> >>>>   }
> >>>
> >>> You've changed the definition of ALL_REGS here (to include VPR_REG),
> >>> but not really explained why.  Is that the source of the underlying
> >>> issue with the 'appeared' you mention?
> >>
> >>
> >> I first added VPR_REG to ALL_REGS, but Richard Sandiford suggested I
> >> create a new GENERAL_AND_VPR_REGS that would be more restrictive. I
> >> did not remove VPR_REG from ALL_REGS because I thought it was an
> >> omission: shouldn't ALL_REGS contain all registers?
> >
> > Surely that should be a separate patch then.
> 
> OK, I can remove that line from this patch and make a separate one-liner
> for ALL_REGS.

Did you end up sending that patch out? (Sorry, I may have missed it in my archive).
This patch to add GENERAL_AND_VPR_REGS is okay with the ALL_REGS change separated out.
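
For readers following the masks in the quoted hunk, here is a small sketch of how they decode. It assumes the usual REG_CLASS_CONTENTS layout where hard register N lives in word N / 32 at bit N % 32. The array names, in_class and is_subset are invented for this illustration and do not exist in arm.h.

```c
#include <assert.h>
#include <stdint.h>

#define WORDS 4

/* Masks copied from the quoted hunk.  */
static const uint32_t vpr_reg[WORDS] =
  { 0x00000000, 0x00000000, 0x00000000, 0x00000400 };
static const uint32_t general_and_vpr_regs[WORDS] =
  { 0x00005FFF, 0x00000000, 0x00000000, 0x00000400 };
static const uint32_t all_regs_old[WORDS] =
  { 0xFFFF7FFF, 0xFFFFFFFF, 0xFFFFFFFF, 0x0000000F };
static const uint32_t all_regs_new[WORDS] =
  { 0xFFFF7FFF, 0xFFFFFFFF, 0xFFFFFFFF, 0x0000040F };

/* Hypothetical helper: is hard register REGNO a member of class CLS?  */
static int in_class(const uint32_t *cls, unsigned regno)
{
  return (cls[regno / 32] >> (regno % 32)) & 1u;
}

/* Hypothetical helper: is every register of A also in B?  */
static int is_subset(const uint32_t *a, const uint32_t *b)
{
  for (int i = 0; i < WORDS; i++)
    if (a[i] & ~b[i])
      return 0;
  return 1;
}

void check_reg_class_masks(void)
{
  /* 0x400 in the fourth word is bit 10 of word 3: register 3*32+10 = 106.  */
  assert(in_class(vpr_reg, 3 * 32 + 10));

  /* 0x00005FFF covers bits 0-12 and 14; bits 13 and 15 are clear.  */
  assert(in_class(general_and_vpr_regs, 12));
  assert(!in_class(general_and_vpr_regs, 13));
  assert(in_class(general_and_vpr_regs, 14));
  assert(!in_class(general_and_vpr_regs, 15));

  /* Only the updated ALL_REGS mask contains the new class.  */
  assert(!is_subset(general_and_vpr_regs, all_regs_old));
  assert(is_subset(general_and_vpr_regs, all_regs_new));
}
```

Under that reading, the new class is r0-r12 plus r14 plus the VPR bit, and the old ALL_REGS mask (0x0000000F in the last word) would not have been a superset of it.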

Thanks,
Kyrill

> 
> Thanks,
> 
> Christophe
> 
> 
> >
> > R.
> >
> >>
> >>
> >>>
> >>> R.
> >>>
> >>>
> >>>>     #define FP_SYSREGS \
> >>>>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 02/13] arm: Add tests for PR target/100757
  2021-09-28 11:12   ` Kyrylo Tkachov
@ 2021-09-28 13:28     ` Christophe LYON
  0 siblings, 0 replies; 20+ messages in thread
From: Christophe LYON @ 2021-09-28 13:28 UTC (permalink / raw)
  To: Kyrylo Tkachov; +Cc: gcc-patches


On 28/09/2021 13:12, Kyrylo Tkachov wrote:
>
>> -----Original Message-----
>> From: Gcc-patches <gcc-patches-
>> bounces+kyrylo.tkachov=arm.com@gcc.gnu.org> On Behalf Of Christophe
>> Lyon via Gcc-patches
>> Sent: 07 September 2021 10:15
>> To: gcc-patches@gcc.gnu.org
>> Subject: [PATCH 02/13] arm: Add tests for PR target/100757
>>
>> These tests currently trigger an ICE which is fixed later in the patch
>> series.
>>
>> The pr100757*.c testcases are derived from
>> gcc.c-torture/compile/20160205-1.c, forcing the use of MVE, and using
>> various types and return values different from 0 and 1 to avoid
>> commonalization with boolean masks.  In addition, since we should not
>> need these masks, the tests make sure they are not present.
> Ok, but I'd rather it was committed together with the patch that fixes the ICE.
> I don't mind if it's a separate commit or rolled into that patch.


Sure, I'll wait for the main patch to be approved. I split it this way
to make the review easier, and so that the testcase can be exercised
without the proposed fix.

Thanks,

Christophe


>
> Thanks,
> Kyrill
>
>> 2021-09-01  Christophe Lyon  <christophe.lyon@foss.st.com>
>>
>> 	gcc/testsuite/
>> 	PR target/100757
>> 	* gcc.target/arm/simd/pr100757-2.c: New.
>> 	* gcc.target/arm/simd/pr100757-3.c: New.
>> 	* gcc.target/arm/simd/pr100757-4.c: New.
>> 	* gcc.target/arm/simd/pr100757.c: New.
>>
>> diff --git a/gcc/testsuite/gcc.target/arm/simd/pr100757-2.c
>> b/gcc/testsuite/gcc.target/arm/simd/pr100757-2.c
>> new file mode 100644
>> index 00000000000..c2262b4d81e
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/arm/simd/pr100757-2.c
>> @@ -0,0 +1,20 @@
>> +/* { dg-do compile } */
>> +/* { dg-require-effective-target arm_v8_1m_mve_fp_ok } */
>> +/* { dg-add-options arm_v8_1m_mve_fp } */
>> +/* { dg-additional-options "-O3 -funsafe-math-optimizations" } */
>> +/* Derived from gcc.c-torture/compile/20160205-1.c.  */
>> +
>> +float a[32];
>> +int fn1(int d) {
>> +  int c = 4;
>> +  for (int b = 0; b < 32; b++)
>> +    if (a[b] != 2.0f)
>> +      c = 5;
>> +  return c;
>> +}
>> +
>> +/* { dg-final { scan-assembler-times {\t.word\t1073741824\n} 4 } } */ /* Constant 2.0f.  */
>> +/* { dg-final { scan-assembler-times {\t.word\t4\n} 4 } } */ /* Initial value for c.  */
>> +/* { dg-final { scan-assembler-times {\t.word\t5\n} 4 } } */ /* Possible value for c.  */
>> +/* { dg-final { scan-assembler-not {\t.word\t1\n} } } */ /* 'true' mask.  */
>> +/* { dg-final { scan-assembler-not {\t.word\t0\n} } } */ /* 'false' mask.  */
>> diff --git a/gcc/testsuite/gcc.target/arm/simd/pr100757-3.c
>> b/gcc/testsuite/gcc.target/arm/simd/pr100757-3.c
>> new file mode 100644
>> index 00000000000..e604555c04c
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/arm/simd/pr100757-3.c
>> @@ -0,0 +1,20 @@
>> +/* { dg-do compile } */
>> +/* { dg-require-effective-target arm_v8_1m_mve_fp_ok } */
>> +/* { dg-add-options arm_v8_1m_mve_fp } */
>> +/* { dg-additional-options "-O3 -funsafe-math-optimizations" } */
>> +/* Copied from gcc.c-torture/compile/20160205-1.c.  */
>> +
>> +float a[32];
>> +float fn1(int d) {
>> +  float c = 4.0f;
>> +  for (int b = 0; b < 32; b++)
>> +    if (a[b] != 2.0f)
>> +      c = 5.0f;
>> +  return c;
>> +}
>> +
>> +/* { dg-final { scan-assembler-times {\t.word\t1073741824\n} 4 } } */ /* Constant 2.0f.  */
>> +/* { dg-final { scan-assembler-times {\t.word\t1084227584\n} 4 } } */ /* Possible value for c (5.0).  */
>> +/* { dg-final { scan-assembler-times {\t.word\t1082130432\n} 4 } } */ /* Initial value for c (4.0).  */
>> +/* { dg-final { scan-assembler-not {\t.word\t1\n} } } */ /* 'true' mask.  */
>> +/* { dg-final { scan-assembler-not {\t.word\t0\n} } } */ /* 'false' mask.  */
>> diff --git a/gcc/testsuite/gcc.target/arm/simd/pr100757-4.c
>> b/gcc/testsuite/gcc.target/arm/simd/pr100757-4.c
>> new file mode 100644
>> index 00000000000..c12040c517f
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/arm/simd/pr100757-4.c
>> @@ -0,0 +1,19 @@
>> +/* { dg-do compile } */
>> +/* { dg-require-effective-target arm_v8_1m_mve_ok } */
>> +/* { dg-add-options arm_v8_1m_mve } */
>> +/* { dg-additional-options "-O3" } */
>> +/* Derived from gcc.c-torture/compile/20160205-1.c.  */
>> +
>> +unsigned int a[32];
>> +int fn1(int d) {
>> +  int c = 2;
>> +  for (int b = 0; b < 32; b++)
>> +    if (a[b])
>> +      c = 3;
>> +  return c;
>> +}
>> +
>> +/* { dg-final { scan-assembler-times {\t.word\t0\n} 4 } } */ /* 'false' mask.  */
>> +/* { dg-final { scan-assembler-not {\t.word\t1\n} } } */ /* 'true' mask.  */
>> +/* { dg-final { scan-assembler-times {\t.word\t2\n} 4 } } */ /* Initial value for c.  */
>> +/* { dg-final { scan-assembler-times {\t.word\t3\n} 4 } } */ /* Possible value for c.  */
>> diff --git a/gcc/testsuite/gcc.target/arm/simd/pr100757.c
>> b/gcc/testsuite/gcc.target/arm/simd/pr100757.c
>> new file mode 100644
>> index 00000000000..41d6e4e2d7a
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/arm/simd/pr100757.c
>> @@ -0,0 +1,19 @@
>> +/* { dg-do compile } */
>> +/* { dg-require-effective-target arm_v8_1m_mve_ok } */
>> +/* { dg-add-options arm_v8_1m_mve } */
>> +/* { dg-additional-options "-O3" } */
>> +/* Derived from gcc.c-torture/compile/20160205-1.c.  */
>> +
>> +int a[32];
>> +int fn1(int d) {
>> +  int c = 2;
>> +  for (int b = 0; b < 32; b++)
>> +    if (a[b])
>> +      c = 3;
>> +  return c;
>> +}
>> +
>> +/* { dg-final { scan-assembler-times {\t.word\t0\n} 4 } } */ /* 'false' mask.  */
>> +/* { dg-final { scan-assembler-not {\t.word\t1\n} } } */ /* 'true' mask.  */
>> +/* { dg-final { scan-assembler-times {\t.word\t2\n} 4 } } */ /* Initial value for c.  */
>> +/* { dg-final { scan-assembler-times {\t.word\t3\n} 4 } } */ /* Possible value for c.  */
>> --
>> 2.25.1

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 03/13] arm: Add test for PR target/101325
  2021-09-28 11:14   ` Kyrylo Tkachov
@ 2021-09-28 13:30     ` Christophe LYON
  2021-10-11 12:43       ` Christophe LYON
  0 siblings, 1 reply; 20+ messages in thread
From: Christophe LYON @ 2021-09-28 13:30 UTC (permalink / raw)
  To: Kyrylo Tkachov; +Cc: gcc-patches, Alex Coplan, Andrea Corallo


On 28/09/2021 13:14, Kyrylo Tkachov wrote:
>
>> -----Original Message-----
>> From: Gcc-patches <gcc-patches-
>> bounces+kyrylo.tkachov=arm.com@gcc.gnu.org> On Behalf Of Christophe
>> Lyon via Gcc-patches
>> Sent: 07 September 2021 10:15
>> To: gcc-patches@gcc.gnu.org
>> Subject: [PATCH 03/13] arm: Add test for PR target/101325
>>
>> This test is derived from the one provided in the PR: it is a
>> compile-only test because I do not have access to anything that could
>> execute it.  We can switch it to 'dg-do run' later, however it would
>> be better to write a new executable test to ensure coverage in case
>> the tester cannot execute such code (and it will need a new
>> arm_v8_1m_mve_hw or similar effective-target).
> The test is okay for now.
> I think we'll want to have an arm_v8_1m_mve_hw target sooner or later.
> Maybe Alex or Andrea can help to write one we can use?


Since I posted the patch series, QEMU has gained support for MVE, so I
plan to write a similar testcase that is executable.

There's already an executable testcase in the PR.
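
As a rough idea of what an executable check could compare against, here is a portable C model of the predicate that vcmpeqq on int8x16_t is expected to produce. It assumes the architectural layout of one predicate bit per byte lane, returned as a 16-bit value. model_vcmpeq_s8 and check_model_vcmpeq_s8 are names invented for this sketch, which does not use arm_mve.h and is not the testcase from the PR.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical reference model: bit I of the result is set when byte
   lane I of V equals byte lane I of W.  */
uint16_t model_vcmpeq_s8(const int8_t v[16], const int8_t w[16])
{
  uint16_t p = 0;
  for (int i = 0; i < 16; i++)
    if (v[i] == w[i])
      p |= (uint16_t)(1u << i);
  return p;
}

void check_model_vcmpeq_s8(void)
{
  int8_t v[16], w[16];
  for (int i = 0; i < 16; i++)
    {
      v[i] = (int8_t)i;
      w[i] = (int8_t)i;
    }
  /* All sixteen lanes equal: every predicate bit set.  */
  assert(model_vcmpeq_s8(v, w) == 0xFFFF);

  /* Make lanes 0 and 15 differ: bits 0 and 15 are cleared.  */
  w[0] = 100;
  w[15] = -1;
  assert(model_vcmpeq_s8(v, w) == 0x7FFE);
}
```

The result fits in 16 bits, which is consistent with the compile-only test scanning for a uxth after the vmrs from P0.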

Thanks

Christophe


>
> Thanks,
> Kyrill
>
>> 2021-09-01  Christophe Lyon  <christophe.lyon@foss.st.com>
>>
>> 	gcc/testsuite/
>> 	PR target/101325
>> 	* gcc.target/arm/simd/pr101325.c: New.
>>
>> diff --git a/gcc/testsuite/gcc.target/arm/simd/pr101325.c
>> b/gcc/testsuite/gcc.target/arm/simd/pr101325.c
>> new file mode 100644
>> index 00000000000..a466683a0b1
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/arm/simd/pr101325.c
>> @@ -0,0 +1,14 @@
>> +/* { dg-do compile } */
>> +/* { dg-require-effective-target arm_v8_1m_mve_ok } */
>> +/* { dg-add-options arm_v8_1m_mve } */
>> +/* { dg-additional-options "-O3" } */
>> +
>> +#include <arm_mve.h>
>> +
>> +unsigned foo(int8x16_t v, int8x16_t w)
>> +{
>> +  return vcmpeqq (v, w);
>> +}
>> +/* { dg-final { scan-assembler {\tvcmp.i8  eq} } } */
>> +/* { dg-final { scan-assembler {\tvmrs\t r[0-9]+, P0} } } */
>> +/* { dg-final { scan-assembler {\tuxth} } } */
>> --
>> 2.25.1

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 04/13] arm: Add GENERAL_AND_VPR_REGS regclass
  2021-09-28 11:18           ` Kyrylo Tkachov
@ 2021-09-28 13:32             ` Christophe LYON
  2021-10-11 12:44               ` Christophe LYON
  0 siblings, 1 reply; 20+ messages in thread
From: Christophe LYON @ 2021-09-28 13:32 UTC (permalink / raw)
  To: Kyrylo Tkachov, Richard Earnshaw; +Cc: gcc-patches


On 28/09/2021 13:18, Kyrylo Tkachov wrote:
> Hi Christophe,
>
>> -----Original Message-----
>> From: Gcc-patches <gcc-patches-
>> bounces+kyrylo.tkachov=arm.com@gcc.gnu.org> On Behalf Of Christophe
>> LYON via Gcc-patches
>> Sent: 08 September 2021 08:49
>> To: Richard Earnshaw <Richard.Earnshaw@foss.arm.com>; gcc-
>> patches@gcc.gnu.org
>> Subject: Re: [PATCH 04/13] arm: Add GENERAL_AND_VPR_REGS regclass
>>
>>
>> On 07/09/2021 15:35, Richard Earnshaw wrote:
>>>
>>> On 07/09/2021 13:05, Christophe LYON wrote:
>>>> On 07/09/2021 11:42, Richard Earnshaw wrote:
>>>>>
>>>>> On 07/09/2021 10:15, Christophe Lyon via Gcc-patches wrote:
>>>>>> At some point during the development of this patch series, it appeared
>>>>>> that in some cases the register allocator wants “VPR or general”
>>>>>> rather than “VPR or general or FP” (which is the same thing as
>>>>>> ALL_REGS).  The series does not seem to require this anymore, but it
>>>>>> seems to be a good thing to do anyway, to give the register allocator
>>>>>> more freedom.
>>>>>>
>>>>>> 2021-09-01  Christophe Lyon <christophe.lyon@foss.st.com>
>>>>>>
>>>>>>      gcc/
>>>>>>      * config/arm/arm.h (reg_class): Add GENERAL_AND_VPR_REGS.
>>>>>>      (REG_CLASS_NAMES): Likewise.
>>>>>>      (REG_CLASS_CONTENTS): Likewise. Add VPR_REG to ALL_REGS.
>>>>>>
>>>>>> diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
>>>>>> index 015299c1534..fab39d05916 100644
>>>>>> --- a/gcc/config/arm/arm.h
>>>>>> +++ b/gcc/config/arm/arm.h
>>>>>> @@ -1286,6 +1286,7 @@ enum reg_class
>>>>>>      SFP_REG,
>>>>>>      AFP_REG,
>>>>>>      VPR_REG,
>>>>>> +  GENERAL_AND_VPR_REGS,
>>>>>>      ALL_REGS,
>>>>>>      LIM_REG_CLASSES
>>>>>>    };
>>>>>> @@ -1315,6 +1316,7 @@ enum reg_class
>>>>>>      "SFP_REG",        \
>>>>>>      "AFP_REG",        \
>>>>>>      "VPR_REG",        \
>>>>>> +  "GENERAL_AND_VPR_REGS", \
>>>>>>      "ALL_REGS"        \
>>>>>>    }
>>>>>>    @@ -1343,7 +1345,8 @@ enum reg_class
>>>>>>      { 0x00000000, 0x00000000, 0x00000000, 0x00000040 }, /* SFP_REG
>>>>>> */    \
>>>>>>      { 0x00000000, 0x00000000, 0x00000000, 0x00000080 }, /* AFP_REG
>>>>>> */    \
>>>>>>      { 0x00000000, 0x00000000, 0x00000000, 0x00000400 }, /* VPR_REG.
>>>>>> */    \
>>>>>> -  { 0xFFFF7FFF, 0xFFFFFFFF, 0xFFFFFFFF, 0x0000000F }  /* ALL_REGS.
>>>>>> */    \
>>>>>> +  { 0x00005FFF, 0x00000000, 0x00000000, 0x00000400 }, /*
>>>>>> GENERAL_AND_VPR_REGS.  */ \
>>>>>> +  { 0xFFFF7FFF, 0xFFFFFFFF, 0xFFFFFFFF, 0x0000040F }  /* ALL_REGS.
>>>>>> */    \
>>>>>>    }
>>>>> You've changed the definition of ALL_REGS here (to include VPR_REG),
>>>>> but not really explained why.  Is that the source of the underlying
>>>>> issue with the 'appeared' you mention?
>>>>
>>>> I first added VPR_REG to ALL_REGS, but Richard Sandiford suggested I
>>>> create a new GENERAL_AND_VPR_REGS that would be more restrictive. I
>>>> did not remove VPR_REG from ALL_REGS because I thought it was an
>>>> omission: shouldn't ALL_REGS contain all registers?
>>> Surely that should be a separate patch then.
>> OK, I can remove that line from this patch and make a separate one-liner
>> for ALL_REGS.
> Did you end up sending that patch out? (Sorry, I may have missed it in my archive).
> This patch to add GENERAL_AND_VPR_REGS is okay with the ALL_REGS change separated out.

No, I didn't send it yet: I suspect there will be iterations on the next 
patches in the series, so this small change alone wasn't worth sending a v2 :-)

Thanks,

Christophe


>
> Thanks,
> Kyrill
>
>> Thanks,
>>
>> Christophe
>>
>>
>>> R.
>>>
>>>>
>>>>> R.
>>>>>
>>>>>
>>>>>>      #define FP_SYSREGS \
>>>>>>


* Re: [PATCH 03/13] arm: Add test for PR target/101325
  2021-09-28 13:30     ` Christophe LYON
@ 2021-10-11 12:43       ` Christophe LYON
  0 siblings, 0 replies; 20+ messages in thread
From: Christophe LYON @ 2021-10-11 12:43 UTC (permalink / raw)
  To: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 2394 bytes --]


On 28/09/2021 15:30, Christophe LYON via Gcc-patches wrote:
>
> On 28/09/2021 13:14, Kyrylo Tkachov wrote:
>>
>>> -----Original Message-----
>>> From: Gcc-patches <gcc-patches-
>>> bounces+kyrylo.tkachov=arm.com@gcc.gnu.org> On Behalf Of Christophe
>>> Lyon via Gcc-patches
>>> Sent: 07 September 2021 10:15
>>> To: gcc-patches@gcc.gnu.org
>>> Subject: [PATCH 03/13] arm: Add test for PR target/101325
>>>
>>> This test is derived from the one provided in the PR: it is a
>>> compile-only test because I do not have access to anything that could
>>> execute it.  We can switch it to 'dg-do run' later, however it would
>>> be better to write a new executable test to ensure coverage in case
>>> the tester cannot execute such code (and it will need a new
>>> arm_v8_1m_mve_hw or similar effective-target).
>> The test is okay for now.
>> I think we'll want to have a arm_v8_1m_mve_hw target sooner or later.
>> Maybe Alex or Andrea can help to write one we can use?
>
>
> Since I posted the patch series, QEMU has gained support for MVE, I 
> plan to write a similar testcase which is executable.
>
> There's already an executable testcase in the PR.
>
> Thanks
>
> Christophe
>
Here is an updated version of this patch, which adds an executable test.

I thought I would re-post the whole series later, but I haven't yet 
received feedback on the main patches, which I expect to trigger some 
discussions.
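
For readers who cannot run MVE code, here is a small scalar sketch of what
the executable test checks. It is only an illustration, not part of the
patch: model_vcmpeq_s8 is a hypothetical helper name, and the sketch assumes
one predicate bit per 8-bit lane, which is what the 0xffff comparison in
pr101325-2.c relies on for two equal int8x16_t inputs.

```c
#include <stdint.h>

/* Hypothetical scalar model of a lane-wise equality compare on sixteen
   8-bit lanes.  Bit N of the result is set when lane N of V equals lane N
   of W.  This is not an MVE intrinsic, just an illustration.  */
uint16_t
model_vcmpeq_s8 (const int8_t v[16], const int8_t w[16])
{
  uint16_t mask = 0;

  for (int lane = 0; lane < 16; lane++)
    if (v[lane] == w[lane])
      mask |= (uint16_t) (1u << lane);	/* One predicate bit per lane.  */

  return mask;
}
```

With two all-zero inputs every lane matches, so the model returns 0xffff,
the same value the new test compares against.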

Christophe

>
>>
>> Thanks,
>> Kyrill
>>
>>> 2021-09-01  Christophe Lyon <christophe.lyon@foss.st.com>
>>>
>>>     gcc/testsuite/
>>>     PR target/101325
>>>     * gcc.target/arm/simd/pr101325.c: New.
>>>
>>> diff --git a/gcc/testsuite/gcc.target/arm/simd/pr101325.c
>>> b/gcc/testsuite/gcc.target/arm/simd/pr101325.c
>>> new file mode 100644
>>> index 00000000000..a466683a0b1
>>> --- /dev/null
>>> +++ b/gcc/testsuite/gcc.target/arm/simd/pr101325.c
>>> @@ -0,0 +1,14 @@
>>> +/* { dg-do compile } */
>>> +/* { dg-require-effective-target arm_v8_1m_mve_ok } */
>>> +/* { dg-add-options arm_v8_1m_mve } */
>>> +/* { dg-additional-options "-O3" } */
>>> +
>>> +#include <arm_mve.h>
>>> +
>>> +unsigned foo(int8x16_t v, int8x16_t w)
>>> +{
>>> +  return vcmpeqq (v, w);
>>> +}
>>> +/* { dg-final { scan-assembler {\tvcmp.i8  eq} } } */
>>> +/* { dg-final { scan-assembler {\tvmrs\t r[0-9]+, P0} } } */
>>> +/* { dg-final { scan-assembler {\tuxth} } } */
>>> -- 
>>> 2.25.1

[-- Attachment #2: v2-0003-arm-Add-tests-for-PR-target-101325.patch --]
[-- Type: text/plain, Size: 3021 bytes --]

From ef48339f8048ee6417845ed2e6fd95f550ee798e Mon Sep 17 00:00:00 2001
From: Christophe Lyon <christophe.lyon@foss.st.com>
Date: Wed, 25 Aug 2021 17:26:31 +0000
Subject: [PATCH v2 03/14] arm: Add tests for PR target/101325

These tests are derived from the one provided in the PR: there is a
compile-only test because I did not have access to anything that could
execute MVE code until recently.
I have been able to add an executable test since QEMU supports MVE.

Instead of adding arm_v8_1m_mve_hw, I update arm_mve_hw so that it
uses add_options_for_arm_v8_1m_mve_fp, like arm_neon_hw does.  This
ensures arm_mve_hw passes even if the toolchain does not generate MVE
code by default.

2021-10-01  Christophe Lyon  <christophe.lyon@foss.st.com>

	gcc/testsuite/
	PR target/101325
	* gcc.target/arm/simd/pr101325.c: New.
	* gcc.target/arm/simd/pr101325-2.c: New.
	* lib/target-supports.exp (check_effective_target_arm_mve_hw): Use
	add_options_for_arm_v8_1m_mve_fp.

add executable test and update check_effective_target_arm_mve_hw

diff --git a/gcc/testsuite/gcc.target/arm/simd/pr101325-2.c b/gcc/testsuite/gcc.target/arm/simd/pr101325-2.c
new file mode 100644
index 00000000000..7907a386385
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/simd/pr101325-2.c
@@ -0,0 +1,19 @@
+/* { dg-do run } */
+/* { dg-require-effective-target arm_mve_hw } */
+/* { dg-options "-O3" } */
+/* { dg-add-options arm_v8_1m_mve } */
+
+#include <arm_mve.h>
+
+
+__attribute((noinline,noipa))
+unsigned foo(int8x16_t v, int8x16_t w)
+{
+  return vcmpeqq (v, w);
+}
+
+int main(void)
+{
+  if (foo (vdupq_n_s8(0), vdupq_n_s8(0)) != 0xffffU)
+    __builtin_abort ();
+}
diff --git a/gcc/testsuite/gcc.target/arm/simd/pr101325.c b/gcc/testsuite/gcc.target/arm/simd/pr101325.c
new file mode 100644
index 00000000000..a466683a0b1
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/simd/pr101325.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_v8_1m_mve_ok } */
+/* { dg-add-options arm_v8_1m_mve } */
+/* { dg-additional-options "-O3" } */
+
+#include <arm_mve.h>
+
+unsigned foo(int8x16_t v, int8x16_t w)
+{
+  return vcmpeqq (v, w);
+}
+/* { dg-final { scan-assembler {\tvcmp.i8  eq} } } */
+/* { dg-final { scan-assembler {\tvmrs\t r[0-9]+, P0} } } */
+/* { dg-final { scan-assembler {\tuxth} } } */
diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index e030e4f376b..b0e35b602af 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -4889,6 +4889,7 @@ proc check_effective_target_arm_cmse_hw { } {
 	}
     } "-mcmse -Wl,--section-start,.gnu.sgstubs=0x00400000"]
 }
+
 # Return 1 if the target supports executing MVE instructions, 0
 # otherwise.
 
@@ -4904,7 +4905,7 @@ proc check_effective_target_arm_mve_hw {} {
 	       : "0" (a), "r" (b));
 	  return (a != 2);
 	}
-    } ""]
+    } [add_options_for_arm_v8_1m_mve_fp ""]]
 }
 
 # Return 1 if this is an ARM target where ARMv8-M Security Extensions with
-- 
2.25.1



* Re: [PATCH 04/13] arm: Add GENERAL_AND_VPR_REGS regclass
  2021-09-28 13:32             ` Christophe LYON
@ 2021-10-11 12:44               ` Christophe LYON
  0 siblings, 0 replies; 20+ messages in thread
From: Christophe LYON @ 2021-10-11 12:44 UTC (permalink / raw)
  To: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 4040 bytes --]


On 28/09/2021 15:32, Christophe LYON via Gcc-patches wrote:
>
> On 28/09/2021 13:18, Kyrylo Tkachov wrote:
>> Hi Christophe,
>>
>>> -----Original Message-----
>>> From: Gcc-patches <gcc-patches-
>>> bounces+kyrylo.tkachov=arm.com@gcc.gnu.org> On Behalf Of Christophe
>>> LYON via Gcc-patches
>>> Sent: 08 September 2021 08:49
>>> To: Richard Earnshaw <Richard.Earnshaw@foss.arm.com>; gcc-
>>> patches@gcc.gnu.org
>>> Subject: Re: [PATCH 04/13] arm: Add GENERAL_AND_VPR_REGS regclass
>>>
>>>
>>> On 07/09/2021 15:35, Richard Earnshaw wrote:
>>>>
>>>> On 07/09/2021 13:05, Christophe LYON wrote:
>>>>> On 07/09/2021 11:42, Richard Earnshaw wrote:
>>>>>>
>>>>>> On 07/09/2021 10:15, Christophe Lyon via Gcc-patches wrote:
>>>>>>> At some point during the development of this patch series, it 
>>>>>>> appeared
>>>>>>> that in some cases the register allocator wants “VPR or general”
>>>>>>> rather than “VPR or general or FP” (which is the same thing as
>>>>>>> ALL_REGS).  The series does not seem to require this anymore, 
>>>>>>> but it
>>>>>>> seems to be a good thing to do anyway, to give the register 
>>>>>>> allocator
>>>>>>> more freedom.
>>>>>>>
>>>>>>> 2021-09-01  Christophe Lyon <christophe.lyon@foss.st.com>
>>>>>>>
>>>>>>>      gcc/
>>>>>>>      * config/arm/arm.h (reg_class): Add GENERAL_AND_VPR_REGS.
>>>>>>>      (REG_CLASS_NAMES): Likewise.
>>>>>>>      (REG_CLASS_CONTENTS): Likewise. Add VPR_REG to ALL_REGS.
>>>>>>>
>>>>>>> diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
>>>>>>> index 015299c1534..fab39d05916 100644
>>>>>>> --- a/gcc/config/arm/arm.h
>>>>>>> +++ b/gcc/config/arm/arm.h
>>>>>>> @@ -1286,6 +1286,7 @@ enum reg_class
>>>>>>>      SFP_REG,
>>>>>>>      AFP_REG,
>>>>>>>      VPR_REG,
>>>>>>> +  GENERAL_AND_VPR_REGS,
>>>>>>>      ALL_REGS,
>>>>>>>      LIM_REG_CLASSES
>>>>>>>    };
>>>>>>> @@ -1315,6 +1316,7 @@ enum reg_class
>>>>>>>      "SFP_REG",        \
>>>>>>>      "AFP_REG",        \
>>>>>>>      "VPR_REG",        \
>>>>>>> +  "GENERAL_AND_VPR_REGS", \
>>>>>>>      "ALL_REGS"        \
>>>>>>>    }
>>>>>>>    @@ -1343,7 +1345,8 @@ enum reg_class
>>>>>>>      { 0x00000000, 0x00000000, 0x00000000, 0x00000040 }, /* SFP_REG
>>>>>>> */    \
>>>>>>>      { 0x00000000, 0x00000000, 0x00000000, 0x00000080 }, /* AFP_REG
>>>>>>> */    \
>>>>>>>      { 0x00000000, 0x00000000, 0x00000000, 0x00000400 }, /* 
>>>>>>> VPR_REG.
>>>>>>> */    \
>>>>>>> -  { 0xFFFF7FFF, 0xFFFFFFFF, 0xFFFFFFFF, 0x0000000F } /* ALL_REGS.
>>>>>>> */    \
>>>>>>> +  { 0x00005FFF, 0x00000000, 0x00000000, 0x00000400 }, /*
>>>>>>> GENERAL_AND_VPR_REGS.  */ \
>>>>>>> +  { 0xFFFF7FFF, 0xFFFFFFFF, 0xFFFFFFFF, 0x0000040F } /* ALL_REGS.
>>>>>>> */    \
>>>>>>>    }
>>>>>> You've changed the definition of ALL_REGS here (to include VPR_REG),
>>>>>> but not really explained why.  Is that the source of the underlying
>>>>>> issue with the 'appeared' you mention?
>>>>>
>>>>> I first added VPR_REG to ALL_REGS, but Richard Sandiford suggested I
>>>>> create a new GENERAL_AND_VPR_REGS that would be more restrictive. I
>>>>> did not remove VPR_REG from ALL_REGS because I thought it was an
>>>>> omission: shouldn't ALL_REGS contain all registers?
>>>> Surely that should be a separate patch then.
>>> OK, I can remove that line from this patch and make a separate 
>>> one-liner
>>> for ALL_REGS.
>> Did you end up sending that patch out? (Sorry, I may have missed it 
>> in my archive).
>> This patch to add GENERAL_AND_VPR_REGS is okay with the ALL_REGS 
>> change separated out.
>
> No I didn't send it yet: I suspect there will be iterations on the 
> next patches in the series, this small change alone wasn't worth 
> sending a v2 :-)
>
Here is the patch now split into two parts.
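
To make the effect of the two parts easier to check, here is a standalone
sketch of the bit masks involved. It is not GCC code: the regset type and the
helper names are hypothetical, and it assumes the usual REG_CLASS_CONTENTS
convention that bit I of word W stands for hard register 32 * W + I. The mask
values are copied from the hunks in the attached patches.

```c
#include <stdbool.h>
#include <stdint.h>

#define REGSET_WORDS 4

/* Hypothetical stand-in for one REG_CLASS_CONTENTS entry.  */
typedef struct { uint32_t w[REGSET_WORDS]; } regset;

/* Values copied from the patch hunks.  */
static const regset vpr_reg_set
  = {{ 0x00000000, 0x00000000, 0x00000000, 0x00000400 }};
static const regset general_and_vpr_set
  = {{ 0x00005FFF, 0x00000000, 0x00000000, 0x00000400 }};
static const regset all_regs_before
  = {{ 0xFFFF7FFF, 0xFFFFFFFF, 0xFFFFFFFF, 0x0000000F }};
static const regset all_regs_after
  = {{ 0xFFFF7FFF, 0xFFFFFFFF, 0xFFFFFFFF, 0x0000040F }};

/* Return true if every register in A is also in B.  */
bool
regset_subset_p (const regset *a, const regset *b)
{
  for (int i = 0; i < REGSET_WORDS; i++)
    if (a->w[i] & ~b->w[i])
      return false;
  return true;
}

/* Return true if hard register REGNO is a member of S.  */
bool
regset_contains (const regset *s, unsigned regno)
{
  return regno < 32 * REGSET_WORDS
	 && ((s->w[regno / 32] >> (regno % 32)) & 1u) != 0;
}
```

Under that assumption, regset_subset_p (&vpr_reg_set, &all_regs_before) is
false and becomes true with all_regs_after, which is the omission the second
part fixes. GENERAL_AND_VPR_REGS is a subset of ALL_REGS only after that
second part.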

Christophe


> Thanks,
>
> Christophe
>
>
>>
>> Thanks,
>> Kyrill
>>
>>> Thanks,
>>>
>>> Christophe
>>>
>>>
>>>> R.
>>>>
>>>>>
>>>>>> R.
>>>>>>
>>>>>>
>>>>>>>      #define FP_SYSREGS \
>>>>>>>

[-- Attachment #2: v2-0004-arm-Add-GENERAL_AND_VPR_REGS-regclass.patch --]
[-- Type: text/plain, Size: 1718 bytes --]

From c57fb3fc853d8bf04f589682f03e9d3baac2dbd5 Mon Sep 17 00:00:00 2001
From: Christophe Lyon <christophe.lyon@foss.st.com>
Date: Thu, 26 Aug 2021 16:01:58 +0000
Subject: [PATCH v2 04/14] arm: Add GENERAL_AND_VPR_REGS regclass
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

At some point during the development of this patch series, it appeared
that in some cases the register allocator wants “VPR or general”
rather than “VPR or general or FP” (which is the same thing as
ALL_REGS).  The series does not seem to require this anymore, but it
seems to be a good thing to do anyway, to give the register allocator
more freedom.

2021-09-01  Christophe Lyon  <christophe.lyon@foss.st.com>

	gcc/
	* config/arm/arm.h (reg_class): Add GENERAL_AND_VPR_REGS.
	(REG_CLASS_NAMES): Likewise.
	(REG_CLASS_CONTENTS): Likewise.

diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
index 015299c1534..eae1b1cd0fb 100644
--- a/gcc/config/arm/arm.h
+++ b/gcc/config/arm/arm.h
@@ -1286,6 +1286,7 @@ enum reg_class
   SFP_REG,
   AFP_REG,
   VPR_REG,
+  GENERAL_AND_VPR_REGS,
   ALL_REGS,
   LIM_REG_CLASSES
 };
@@ -1315,6 +1316,7 @@ enum reg_class
   "SFP_REG",		\
   "AFP_REG",		\
   "VPR_REG",		\
+  "GENERAL_AND_VPR_REGS", \
   "ALL_REGS"		\
 }
 
@@ -1343,6 +1345,7 @@ enum reg_class
   { 0x00000000, 0x00000000, 0x00000000, 0x00000040 }, /* SFP_REG */	\
   { 0x00000000, 0x00000000, 0x00000000, 0x00000080 }, /* AFP_REG */	\
   { 0x00000000, 0x00000000, 0x00000000, 0x00000400 }, /* VPR_REG.  */	\
+  { 0x00005FFF, 0x00000000, 0x00000000, 0x00000400 }, /* GENERAL_AND_VPR_REGS.  */ \
   { 0xFFFF7FFF, 0xFFFFFFFF, 0xFFFFFFFF, 0x0000000F }  /* ALL_REGS.  */	\
 }
 
-- 
2.25.1


[-- Attachment #3: v2-0014-arm-Add-VPR_REG-to-ALL_REGS.patch --]
[-- Type: text/plain, Size: 1022 bytes --]

From ce9429d59d513b2998f73c6e256702ad447f2ae7 Mon Sep 17 00:00:00 2001
From: Christophe Lyon <christophe.lyon@foss.st.com>
Date: Wed, 8 Sep 2021 08:27:39 +0000
Subject: [PATCH v2 14/14] arm: Add VPR_REG to ALL_REGS

VPR_REG should be part of ALL_REGS; this patch fixes this omission.

2021-09-08  Christophe Lyon  <christophe.lyon@foss.st.com>

	gcc/
	* config/arm/arm.h (REG_CLASS_CONTENTS): Add VPR_REG to ALL_REGS.

diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
index eae1b1cd0fb..fab39d05916 100644
--- a/gcc/config/arm/arm.h
+++ b/gcc/config/arm/arm.h
@@ -1346,7 +1346,7 @@ enum reg_class
   { 0x00000000, 0x00000000, 0x00000000, 0x00000080 }, /* AFP_REG */	\
   { 0x00000000, 0x00000000, 0x00000000, 0x00000400 }, /* VPR_REG.  */	\
   { 0x00005FFF, 0x00000000, 0x00000000, 0x00000400 }, /* GENERAL_AND_VPR_REGS.  */ \
-  { 0xFFFF7FFF, 0xFFFFFFFF, 0xFFFFFFFF, 0x0000000F }  /* ALL_REGS.  */	\
+  { 0xFFFF7FFF, 0xFFFFFFFF, 0xFFFFFFFF, 0x0000040F }  /* ALL_REGS.  */	\
 }
 
 #define FP_SYSREGS \
-- 
2.25.1



Thread overview: 20+ messages
2021-09-07  9:15 [PATCH 00/13] ARM/MVE use vectors of boolean for predicates Christophe Lyon
2021-09-07  9:15 ` [PATCH 01/13] arm: Add new tests for comparison vectorization with Neon and MVE Christophe Lyon
2021-09-28 11:11   ` Kyrylo Tkachov
2021-09-07  9:15 ` [PATCH 02/13] arm: Add tests for PR target/100757 Christophe Lyon
2021-09-28 11:12   ` Kyrylo Tkachov
2021-09-28 13:28     ` Christophe LYON
2021-09-07  9:15 ` [PATCH 03/13] arm: Add test for PR target/101325 Christophe Lyon
2021-09-28 11:14   ` Kyrylo Tkachov
2021-09-28 13:30     ` Christophe LYON
2021-10-11 12:43       ` Christophe LYON
2021-09-07  9:15 ` [PATCH 04/13] arm: Add GENERAL_AND_VPR_REGS regclass Christophe Lyon
2021-09-07  9:42   ` Richard Earnshaw
2021-09-07 12:05     ` Christophe LYON
2021-09-07 13:35       ` Richard Earnshaw
2021-09-08  7:48         ` Christophe LYON
2021-09-28 11:18           ` Kyrylo Tkachov
2021-09-28 13:32             ` Christophe LYON
2021-10-11 12:44               ` Christophe LYON
2021-09-13  8:33 ` [PATCH 00/13] ARM/MVE use vectors of boolean for predicates Christophe LYON
2021-09-20  9:21   ` Christophe LYON
