* [PATCH] [AVX512] Fix ICE: Convert integer mask to vector in ix86_expand_fp_vec_cmp/ix86_expand_int_vec_cmp [PR98537]
From: Hongtao Liu @ 2021-01-06 6:49 UTC
To: GCC Patches, Kirill Yukhin; +Cc: Jakub Jelinek, H. J. Lu
Hi:
ix86_expand_fp_vec_cmp and ix86_expand_int_vec_cmp are used by vec_cmpmn
to expand a vector comparison into a vector mask, but ix86_expand_sse_cmp
(which both functions call) may return an integer mask whenever an
integer mask is available. So convert the integer mask back to a vector
mask when needed.
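For readers less familiar with the two mask forms, here is a minimal sketch in plain C++ (not GCC code) of what "integer mask" and "vector mask" mean and how one converts to the other. The helper names compare_eq_to_mask and mask_to_vector are hypothetical, and the 8 x 32-bit lane layout is an assumption chosen only for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

using Lanes = std::array<std::int32_t, 8>;

// Hypothetical helper: lane-wise equality producing an AVX-512-style
// integer mask, i.e. one bit per lane (bit i set when lane i compares equal).
std::uint8_t compare_eq_to_mask(const Lanes& a, const Lanes& b) {
    std::uint8_t k = 0;
    for (int i = 0; i < 8; ++i)
        if (a[i] == b[i])
            k = static_cast<std::uint8_t>(k | (1u << i));
    return k;
}

// Hypothetical helper: expand the integer mask into a vector mask, i.e.
// each lane becomes all-ones (-1) when its bit is set and 0 otherwise.
// This is the form a vec_cmpmn consumer expects.
Lanes mask_to_vector(std::uint8_t kmask) {
    Lanes lanes{};
    for (int i = 0; i < 8; ++i)
        lanes[i] = ((kmask >> i) & 1) ? -1 : 0;
    return lanes;
}
```

A caller that needs a vector result would apply mask_to_vector to whatever compare_eq_to_mask returns; the ICE described here arises when that conversion step is missing and an integer mask reaches code expecting a vector.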
gcc/ChangeLog:
PR target/98537
* config/i386/i386-expand.c (ix86_expand_fp_vec_cmp):
When cmp is integer mask, convert it to vector.
(ix86_expand_int_vec_cmp): Ditto.
gcc/testsuite/ChangeLog:
PR target/98537
* g++.target/i386/avx512bw-pr98537-1.C: New test.
* g++.target/i386/avx512vl-pr98537-1.C: New test.
* g++.target/i386/avx512vl-pr98537-2.C: New test.
--
BR,
Hongtao
[-- Attachment #2: 0001-AVX512-Fix-ICE-Convert-integer-mask-to-vector-in-ix8.patch --]
[-- Type: application/x-patch, Size: 4725 bytes --]
* Re: [PATCH] [AVX512] Fix ICE: Convert integer mask to vector in ix86_expand_fp_vec_cmp/ix86_expand_int_vec_cmp [PR98537]
From: Jakub Jelinek @ 2021-01-06 14:39 UTC
To: Hongtao Liu; +Cc: GCC Patches, Kirill Yukhin, H. J. Lu
On Wed, Jan 06, 2021 at 02:49:13PM +0800, Hongtao Liu wrote:
> ix86_expand_fp_vec_cmp/ix86_expand_int_vec_cmp are used by vec_cmpmn
> for vector comparison to vector mask, but ix86_expand_sse_cmp(which is
> called in upper 2 functions.) may return integer mask whenever integer
> mask is available, so convert integer mask back to vector mask if
> needed.
>
> gcc/ChangeLog:
>
> PR target/98537
> * config/i386/i386-expand.c (ix86_expand_fp_vec_cmp):
> When cmp is integer mask, convert it to vector.
> (ix86_expand_int_vec_cmp): Ditto.
>
> gcc/testsuite/ChangeLog:
>
> PR target/98537
> * g++.target/i386/avx512bw-pr98537-1.C: New test.
> * g++.target/i386/avx512vl-pr98537-1.C: New test.
> * g++.target/i386/avx512vl-pr98537-2.C: New test.
Do we optimize it then to an AVX/AVX2 comparison if possible?
@@ -4024,8 +4025,18 @@ ix86_expand_fp_vec_cmp (rtx operands[])
cmp = ix86_expand_sse_cmp (operands[0], code, operands[2], operands[3],
operands[1], operands[2]);
- if (operands[0] != cmp)
- emit_move_insn (operands[0], cmp);
+ if (operands[0] != cmp)
+ {
The indentation of the if above looks wrong.
Otherwise LGTM.
Jakub
* Re: [PATCH] [AVX512] Fix ICE: Convert integer mask to vector in ix86_expand_fp_vec_cmp/ix86_expand_int_vec_cmp [PR98537]
From: Hongtao Liu @ 2021-01-07 5:22 UTC
To: Jakub Jelinek; +Cc: GCC Patches, Kirill Yukhin, H. J. Lu
On Wed, Jan 6, 2021 at 10:39 PM Jakub Jelinek <jakub@redhat.com> wrote:
>
> On Wed, Jan 06, 2021 at 02:49:13PM +0800, Hongtao Liu wrote:
> > ix86_expand_fp_vec_cmp/ix86_expand_int_vec_cmp are used by vec_cmpmn
> > for vector comparison to vector mask, but ix86_expand_sse_cmp(which is
> > called in upper 2 functions.) may return integer mask whenever integer
> > mask is available, so convert integer mask back to vector mask if
> > needed.
> >
> > gcc/ChangeLog:
> >
> > PR target/98537
> > * config/i386/i386-expand.c (ix86_expand_fp_vec_cmp):
> > When cmp is integer mask, convert it to vector.
> > (ix86_expand_int_vec_cmp): Ditto.
> >
> > gcc/testsuite/ChangeLog:
> >
> > PR target/98537
> > * g++.target/i386/avx512bw-pr98537-1.C: New test.
> > * g++.target/i386/avx512vl-pr98537-1.C: New test.
> > * g++.target/i386/avx512vl-pr98537-2.C: New test.
>
> Do we optimize it then to an AVX/AVX2 comparison if possible?
>
While looking at the code, I found other places that require the
comparison dest to be a vector (e.g. ix86_expand_sse_unpack and
ix86_expand_mul_widen_evenodd). That is a potential bug.
So I fixed this in another way: don't generate an integer mask
when the comparison dest is required to be a vector mask.
Updated patch:
ix86_expand_sse_cmp/ix86_expand_int_sse_cmp are used for vector
comparison. AVX512 introduces integer masks, but some of the
original callers require the dest of the comparison to be a vector.
So add a new parameter vector_mask_p to control whether the result
of the vector comparison is a vector or not.
regtested/bootstrapped on x86_64-linux-gnu{-m32,}.
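As a rough illustration of how vector_mask_p is meant to gate the choice, here is a small self-contained C++ model. It is not the GCC implementation: Target, CmpResult, valid_mask_cmp_mode and expand_cmp_kind are hypothetical names, and valid_mask_cmp_mode models only the final size/AVX512VL check of ix86_valid_mask_cmp_mode, assuming AVX512F is already enabled.

```cpp
#include <cassert>

enum class CmpResult { VectorMask, IntegerMask };

// Hypothetical target description; assumes AVX512F is already enabled.
struct Target {
    bool avx512vl;
};

// Simplified stand-in for ix86_valid_mask_cmp_mode: only the last check
// ("vector_size == 64 || TARGET_AVX512VL") is modeled here.
bool valid_mask_cmp_mode(const Target& t, unsigned vector_size) {
    return vector_size == 64 || t.avx512vl;
}

// Models the patched condition
//   if (!vector_mask_p && ix86_valid_mask_cmp_mode (cmp_ops_mode))
// Callers that need a vector dest pass vector_mask_p = true and never
// get an integer mask back.
CmpResult expand_cmp_kind(const Target& t, unsigned vector_size,
                          bool vector_mask_p) {
    if (!vector_mask_p && valid_mask_cmp_mode(t, vector_size))
        return CmpResult::IntegerMask;
    return CmpResult::VectorMask;
}

// Example: the 32-byte comparison from the new testcases under AVX512VL.
void demo() {
    Target vl{true};
    // vec_cmp-style caller: vector dest required.
    assert(expand_cmp_kind(vl, 32, true) == CmpResult::VectorMask);
    // vcond/movcc-style caller: integer mask allowed.
    assert(expand_cmp_kind(vl, 32, false) == CmpResult::IntegerMask);
}
```

In this model the vec_cmp, unpack and mul_widen callers correspond to vector_mask_p = true, while the movcc and vcond callers correspond to vector_mask_p = false, matching the ChangeLog below.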
gcc/ChangeLog:
PR target/98537
* config/i386/i386-expand.c (ix86_expand_sse_cmp): Add a new
parameter vector_mask_p to control whether the comparison
result should be a vector or not.
(ix86_expand_int_sse_cmp): Ditto.
(ix86_expand_sse_movcc): cmpmode should be MODE_INT.
(ix86_expand_fp_movcc): Allow vector comparison dest as
integer mask.
(ix86_expand_fp_vcond): Ditto.
(ix86_expand_int_vcond): Ditto.
(ix86_expand_fp_vec_cmp): Require vector comparison dest as
vector.
(ix86_expand_int_vec_cmp): Ditto.
(ix86_expand_sse_unpack): Ditto.
(ix86_expand_mul_widen_evenodd): Ditto.
gcc/testsuite/ChangeLog:
PR target/98537
* g++.target/i386/avx512bw-pr98537-1.C: New test.
* g++.target/i386/avx512vl-pr98537-1.C: New test.
* g++.target/i386/avx512vl-pr98537-2.C: New test.
> @@ -4024,8 +4025,18 @@ ix86_expand_fp_vec_cmp (rtx operands[])
> cmp = ix86_expand_sse_cmp (operands[0], code, operands[2], operands[3],
> operands[1], operands[2]);
>
> - if (operands[0] != cmp)
> - emit_move_insn (operands[0], cmp);
> + if (operands[0] != cmp)
> + {
>
> The indentation of the if above looks wrong.
> Otherwise LGTM.
>
> Jakub
>
--
BR,
Hongtao
[-- Attachment #2: 0001-Fix-ICE-Convert-integer-mask-to-vector-in-ix86_expan_V2.patch --]
[-- Type: text/x-patch, Size: 11539 bytes --]
From bae4500e17f7590a45504c8c9e3ab0fe6200681d Mon Sep 17 00:00:00 2001
From: liuhongt <hongtao.liu@intel.com>
Date: Thu, 7 Jan 2021 10:15:33 +0800
Subject: [PATCH] Fix ICE: Convert integer mask to vector in
ix86_expand_fp_vec_cmp/ix86_expand_int_vec_cmp [PR98537]
ix86_expand_sse_cmp/ix86_expand_int_sse_cmp is used for vector
comparison, considering that avx512 introduces integer mask, but some
original callers require the dest of comparison to be a vector.
So add a new parameter vector_mask_p to control the result
of vector comparison to be vector or not.
gcc/ChangeLog:
PR target/98537
* config/i386/i386-expand.c (ix86_expand_sse_cmp): Add a new
parameter vector_mask_p to control whether the comparison
result should be a vector or not.
(ix86_expand_int_sse_cmp): Ditto.
(ix86_expand_sse_movcc): cmpmode should be MODE_INT.
(ix86_expand_fp_movcc): Allow vector comparison dest as
integer mask.
(ix86_expand_fp_vcond): Ditto.
(ix86_expand_int_vcond): Ditto.
(ix86_expand_fp_vec_cmp): Require vector comparison dest as
vector.
(ix86_expand_int_vec_cmp): Ditto.
(ix86_expand_sse_unpack): Ditto.
(ix86_expand_mul_widen_evenodd): Ditto.
gcc/testsuite/ChangeLog:
PR target/98537
* g++.target/i386/avx512bw-pr98537-1.C: New test.
* g++.target/i386/avx512vl-pr98537-1.C: New test.
* g++.target/i386/avx512vl-pr98537-2.C: New test.
---
gcc/config/i386/i386-expand.c | 63 ++++++++++---------
.../g++.target/i386/avx512bw-pr98537-1.C | 11 ++++
.../g++.target/i386/avx512vl-pr98537-1.C | 40 ++++++++++++
.../g++.target/i386/avx512vl-pr98537-2.C | 8 +++
4 files changed, 93 insertions(+), 29 deletions(-)
create mode 100644 gcc/testsuite/g++.target/i386/avx512bw-pr98537-1.C
create mode 100644 gcc/testsuite/g++.target/i386/avx512vl-pr98537-1.C
create mode 100644 gcc/testsuite/g++.target/i386/avx512vl-pr98537-2.C
diff --git a/gcc/config/i386/i386-expand.c b/gcc/config/i386/i386-expand.c
index 6e08fd32726..1e4ef3b9f3f 100644
--- a/gcc/config/i386/i386-expand.c
+++ b/gcc/config/i386/i386-expand.c
@@ -3469,11 +3469,12 @@ ix86_valid_mask_cmp_mode (machine_mode mode)
return vector_size == 64 || TARGET_AVX512VL;
}
-/* Expand an SSE comparison. Return the register with the result. */
+/* Expand an SSE comparison. Return the register with the result.
+ If vector_mask_p is true, result of comparison should be a vector mask. */
static rtx
ix86_expand_sse_cmp (rtx dest, enum rtx_code code, rtx cmp_op0, rtx cmp_op1,
- rtx op_true, rtx op_false)
+ rtx op_true, rtx op_false, bool vector_mask_p)
{
machine_mode mode = GET_MODE (dest);
machine_mode cmp_ops_mode = GET_MODE (cmp_op0);
@@ -3485,7 +3486,7 @@ ix86_expand_sse_cmp (rtx dest, enum rtx_code code, rtx cmp_op0, rtx cmp_op1,
bool maskcmp = false;
rtx x;
- if (ix86_valid_mask_cmp_mode (cmp_ops_mode))
+ if (!vector_mask_p && ix86_valid_mask_cmp_mode (cmp_ops_mode))
{
unsigned int nbits = GET_MODE_NUNITS (cmp_ops_mode);
maskcmp = true;
@@ -3517,7 +3518,7 @@ ix86_expand_sse_cmp (rtx dest, enum rtx_code code, rtx cmp_op0, rtx cmp_op1,
x = gen_rtx_fmt_ee (code, cmp_mode, cmp_op0, cmp_op1);
- if (cmp_mode != mode && !maskcmp)
+ if (cmp_mode != mode)
{
x = force_reg (cmp_ops_mode, x);
convert_move (dest, x, false);
@@ -3544,9 +3545,6 @@ ix86_expand_sse_movcc (rtx dest, rtx cmp, rtx op_true, rtx op_false)
return;
}
- /* In AVX512F the result of comparison is an integer mask. */
- bool maskcmp = mode != cmpmode && ix86_valid_mask_cmp_mode (mode);
-
rtx t2, t3, x;
/* If we have an integer mask and FP value then we need
@@ -3557,7 +3555,10 @@ ix86_expand_sse_movcc (rtx dest, rtx cmp, rtx op_true, rtx op_false)
cmp = gen_rtx_SUBREG (mode, cmp, 0);
}
- if (maskcmp)
+ /* In AVX512F the result of comparison is an integer mask. */
+ if (mode != cmpmode
+ && GET_MODE_CLASS (cmpmode) == MODE_INT
+ && ix86_valid_mask_cmp_mode (mode))
{
/* Using vector move with mask register. */
cmp = force_reg (cmpmode, cmp);
@@ -3846,7 +3847,7 @@ ix86_expand_fp_movcc (rtx operands[])
return true;
tmp = ix86_expand_sse_cmp (operands[0], code, op0, op1,
- operands[2], operands[3]);
+ operands[2], operands[3], false);
ix86_expand_sse_movcc (operands[0], tmp, operands[2], operands[3]);
return true;
}
@@ -4002,16 +4003,16 @@ ix86_expand_fp_vec_cmp (rtx operands[])
{
case LTGT:
temp = ix86_expand_sse_cmp (operands[0], ORDERED, operands[2],
- operands[3], NULL, NULL);
+ operands[3], NULL, NULL, true);
cmp = ix86_expand_sse_cmp (operands[0], NE, operands[2],
- operands[3], NULL, NULL);
+ operands[3], NULL, NULL, true);
code = AND;
break;
case UNEQ:
temp = ix86_expand_sse_cmp (operands[0], UNORDERED, operands[2],
- operands[3], NULL, NULL);
+ operands[3], NULL, NULL, true);
cmp = ix86_expand_sse_cmp (operands[0], EQ, operands[2],
- operands[3], NULL, NULL);
+ operands[3], NULL, NULL, true);
code = IOR;
break;
default:
@@ -4022,7 +4023,7 @@ ix86_expand_fp_vec_cmp (rtx operands[])
}
else
cmp = ix86_expand_sse_cmp (operands[0], code, operands[2], operands[3],
- operands[1], operands[2]);
+ operands[1], operands[2], true);
if (operands[0] != cmp)
emit_move_insn (operands[0], cmp);
@@ -4032,7 +4033,7 @@ ix86_expand_fp_vec_cmp (rtx operands[])
static rtx
ix86_expand_int_sse_cmp (rtx dest, enum rtx_code code, rtx cop0, rtx cop1,
- rtx op_true, rtx op_false, bool *negate)
+ rtx op_true, rtx op_false, bool *negate, bool vector_mask_p)
{
machine_mode data_mode = GET_MODE (dest);
machine_mode mode = GET_MODE (cop0);
@@ -4047,7 +4048,7 @@ ix86_expand_int_sse_cmp (rtx dest, enum rtx_code code, rtx cop0, rtx cop1,
;
/* AVX512F supports all of the comparsions
on all 128/256/512-bit vector int types. */
- else if (ix86_valid_mask_cmp_mode (mode))
+ else if (!vector_mask_p && ix86_valid_mask_cmp_mode (mode))
;
else
{
@@ -4266,13 +4267,13 @@ ix86_expand_int_sse_cmp (rtx dest, enum rtx_code code, rtx cop0, rtx cop1,
if (data_mode == mode)
{
x = ix86_expand_sse_cmp (dest, code, cop0, cop1,
- op_true, op_false);
+ op_true, op_false, vector_mask_p);
}
else
{
gcc_assert (GET_MODE_SIZE (data_mode) == GET_MODE_SIZE (mode));
x = ix86_expand_sse_cmp (gen_reg_rtx (mode), code, cop0, cop1,
- op_true, op_false);
+ op_true, op_false, vector_mask_p);
if (GET_MODE (x) == mode)
x = gen_lowpart (data_mode, x);
}
@@ -4288,7 +4289,7 @@ ix86_expand_int_vec_cmp (rtx operands[])
rtx_code code = GET_CODE (operands[1]);
bool negate = false;
rtx cmp = ix86_expand_int_sse_cmp (operands[0], code, operands[2],
- operands[3], NULL, NULL, &negate);
+ operands[3], NULL, NULL, &negate, true);
if (!cmp)
return false;
@@ -4296,7 +4297,7 @@ ix86_expand_int_vec_cmp (rtx operands[])
if (negate)
cmp = ix86_expand_int_sse_cmp (operands[0], EQ, cmp,
CONST0_RTX (GET_MODE (cmp)),
- NULL, NULL, &negate);
+ NULL, NULL, &negate, true);
gcc_assert (!negate);
@@ -4324,16 +4325,20 @@ ix86_expand_fp_vcond (rtx operands[])
{
case LTGT:
temp = ix86_expand_sse_cmp (operands[0], ORDERED, operands[4],
- operands[5], operands[0], operands[0]);
+ operands[5], operands[0], operands[0],
+ false);
cmp = ix86_expand_sse_cmp (operands[0], NE, operands[4],
- operands[5], operands[1], operands[2]);
+ operands[5], operands[1], operands[2],
+ false);
code = AND;
break;
case UNEQ:
temp = ix86_expand_sse_cmp (operands[0], UNORDERED, operands[4],
- operands[5], operands[0], operands[0]);
+ operands[5], operands[0], operands[0],
+ false);
cmp = ix86_expand_sse_cmp (operands[0], EQ, operands[4],
- operands[5], operands[1], operands[2]);
+ operands[5], operands[1], operands[2],
+ false);
code = IOR;
break;
default:
@@ -4350,7 +4355,7 @@ ix86_expand_fp_vcond (rtx operands[])
return true;
cmp = ix86_expand_sse_cmp (operands[0], code, operands[4], operands[5],
- operands[1], operands[2]);
+ operands[1], operands[2], false);
ix86_expand_sse_movcc (operands[0], cmp, operands[1], operands[2]);
return true;
}
@@ -4409,7 +4414,7 @@ ix86_expand_int_vcond (rtx operands[])
operands[2] = force_reg (data_mode, operands[2]);
x = ix86_expand_int_sse_cmp (operands[0], code, cop0, cop1,
- operands[1], operands[2], &negate);
+ operands[1], operands[2], &negate, false);
if (!x)
return false;
@@ -5076,7 +5081,7 @@ ix86_expand_sse_unpack (rtx dest, rtx src, bool unsigned_p, bool high_p)
tmp = force_reg (imode, CONST0_RTX (imode));
else
tmp = ix86_expand_sse_cmp (gen_reg_rtx (imode), GT, CONST0_RTX (imode),
- src, pc_rtx, pc_rtx);
+ src, pc_rtx, pc_rtx, true);
rtx tmp2 = gen_reg_rtx (imode);
emit_insn (unpack (tmp2, src, tmp));
@@ -20374,9 +20379,9 @@ ix86_expand_mul_widen_evenodd (rtx dest, rtx op1, rtx op2,
/* Compute the sign-extension, aka highparts, of the two operands. */
s1 = ix86_expand_sse_cmp (gen_reg_rtx (mode), GT, CONST0_RTX (mode),
- op1, pc_rtx, pc_rtx);
+ op1, pc_rtx, pc_rtx, true);
s2 = ix86_expand_sse_cmp (gen_reg_rtx (mode), GT, CONST0_RTX (mode),
- op2, pc_rtx, pc_rtx);
+ op2, pc_rtx, pc_rtx, true);
/* Multiply LO(A) * HI(B), and vice-versa. */
t1 = gen_reg_rtx (wmode);
diff --git a/gcc/testsuite/g++.target/i386/avx512bw-pr98537-1.C b/gcc/testsuite/g++.target/i386/avx512bw-pr98537-1.C
new file mode 100644
index 00000000000..969684a222b
--- /dev/null
+++ b/gcc/testsuite/g++.target/i386/avx512bw-pr98537-1.C
@@ -0,0 +1,11 @@
+/* PR target/98537 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=x86-64 -std=c++11" } */
+
+#define TYPEV char
+#define TYPEW short
+
+#define T_ARR \
+ __attribute__ ((target ("avx512vl,avx512bw")))
+
+#include "avx512vl-pr98537-1.C"
diff --git a/gcc/testsuite/g++.target/i386/avx512vl-pr98537-1.C b/gcc/testsuite/g++.target/i386/avx512vl-pr98537-1.C
new file mode 100644
index 00000000000..b2ba91111da
--- /dev/null
+++ b/gcc/testsuite/g++.target/i386/avx512vl-pr98537-1.C
@@ -0,0 +1,40 @@
+/* PR target/98537 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=x86-64 -std=c++11" } */
+
+#ifndef TYPEV
+#define TYPEV int
+#endif
+
+#ifndef TYPEW
+#define TYPEW long long
+#endif
+
+#ifndef T_ARR
+#define T_ARR \
+ __attribute__ ((target ("avx512vl")))
+#endif
+
+typedef TYPEV V __attribute__((__vector_size__(32)));
+typedef TYPEW W __attribute__((__vector_size__(32)));
+
+W c, d;
+struct B {};
+B e;
+struct C { W i; };
+void foo (C);
+
+C
+operator== (B, B)
+{
+ W r = (V)c == (V)d;
+ return {r};
+}
+
+void
+T_ARR
+bar ()
+{
+ B a;
+ foo (a == e);
+}
diff --git a/gcc/testsuite/g++.target/i386/avx512vl-pr98537-2.C b/gcc/testsuite/g++.target/i386/avx512vl-pr98537-2.C
new file mode 100644
index 00000000000..42c9682746d
--- /dev/null
+++ b/gcc/testsuite/g++.target/i386/avx512vl-pr98537-2.C
@@ -0,0 +1,8 @@
+/* PR target/98537 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=x86-64 -std=c++11" } */
+
+#define TYPEV float
+#define TYPEW double
+
+#include "avx512vl-pr98537-1.C"
--
2.18.1
* Re: [PATCH] [AVX512] Fix ICE: Convert integer mask to vector in ix86_expand_fp_vec_cmp/ix86_expand_int_vec_cmp [PR98537]
From: Hongtao Liu @ 2021-01-14 11:16 UTC
To: Jakub Jelinek; +Cc: GCC Patches, Kirill Yukhin, H. J. Lu
ping.
On Thu, Jan 7, 2021 at 1:22 PM Hongtao Liu <crazylht@gmail.com> wrote:
>
> On Wed, Jan 6, 2021 at 10:39 PM Jakub Jelinek <jakub@redhat.com> wrote:
> >
> > On Wed, Jan 06, 2021 at 02:49:13PM +0800, Hongtao Liu wrote:
> > > ix86_expand_fp_vec_cmp/ix86_expand_int_vec_cmp are used by vec_cmpmn
> > > for vector comparison to vector mask, but ix86_expand_sse_cmp(which is
> > > called in upper 2 functions.) may return integer mask whenever integer
> > > mask is available, so convert integer mask back to vector mask if
> > > needed.
> > >
> > > gcc/ChangeLog:
> > >
> > > PR target/98537
> > > * config/i386/i386-expand.c (ix86_expand_fp_vec_cmp):
> > > When cmp is integer mask, convert it to vector.
> > > (ix86_expand_int_vec_cmp): Ditto.
> > >
> > > gcc/testsuite/ChangeLog:
> > >
> > > PR target/98537
> > > * g++.target/i386/avx512bw-pr98537-1.C: New test.
> > > * g++.target/i386/avx512vl-pr98537-1.C: New test.
> > > * g++.target/i386/avx512vl-pr98537-2.C: New test.
> >
> > Do we optimize it then to an AVX/AVX2 comparison if possible?
> >
> When i'm looking at the code, i find there's other places which
> require comparison dest to be vector(i.e. ix86_expand_sse_unpack,
> ix86_expand_mul_widen_evenodd). It's a potential bug.
> So I fix this bug in another way which won't generate an integer mask
> when the comparison dest is required to a vector mask.
>
> Update patch:
> ix86_expand_sse_cmp/ix86_expand_int_sse_cmp is used for vector
> comparison, considering that avx512 introduces integer mask, but some
> original callers require the dest of comparison to be a vector.
> So add a new parameter vector_mask_p to control the result
> of vector comparison to be vector or not.
> regtested/bootstrapped on x86_64-linux-gnu{-m32,}.
>
> gcc/ChangeLog:
>
> PR target/98537
> * config/i386/i386-expand.c (ix86_expand_sse_cmp): Add a new
> parameter vector_mask_p to control whether the comparison
> result should be a vector or not.
> (ix86_expand_int_sse_cmp): Ditto.
> (ix86_expand_sse_movcc): cmpmode should be MODE_INT.
> (ix86_expand_fp_movcc): Allow vector comparison dest as
> integer mask.
> (ix86_expand_fp_vcond): Ditto.
> (ix86_expand_int_vcond): Ditto.
> (ix86_expand_fp_vec_cmp): Require vector comparison dest as
> vector.
> (ix86_expand_int_vec_cmp): Ditto.
> (ix86_expand_sse_unpack): Ditto.
> (ix86_expand_mul_widen_evenodd): Ditto.
>
> gcc/testsuite/ChangeLog:
>
> PR target/98537
> * g++.target/i386/avx512bw-pr98537-1.C: New test.
> * g++.target/i386/avx512vl-pr98537-1.C: New test.
> * g++.target/i386/avx512vl-pr98537-2.C: New test.
>
>
> > @@ -4024,8 +4025,18 @@ ix86_expand_fp_vec_cmp (rtx operands[])
> > cmp = ix86_expand_sse_cmp (operands[0], code, operands[2], operands[3],
> > operands[1], operands[2]);
> >
> > - if (operands[0] != cmp)
> > - emit_move_insn (operands[0], cmp);
> > + if (operands[0] != cmp)
> > + {
> >
> > The indentation of the if above looks wrong.
> > Otherwise LGTM.
> >
> > Jakub
> >
>
>
> --
> BR,
> Hongtao
--
BR,
Hongtao
* Re: [PATCH] [AVX512] Fix ICE: Convert integer mask to vector in ix86_expand_fp_vec_cmp/ix86_expand_int_vec_cmp [PR98537]
From: Hongtao Liu @ 2021-01-26 5:30 UTC
To: Jakub Jelinek; +Cc: GCC Patches, Kirill Yukhin, H. J. Lu
On Thu, Jan 14, 2021 at 7:16 PM Hongtao Liu <crazylht@gmail.com> wrote:
>
> ping.
>
> On Thu, Jan 7, 2021 at 1:22 PM Hongtao Liu <crazylht@gmail.com> wrote:
> >
> > On Wed, Jan 6, 2021 at 10:39 PM Jakub Jelinek <jakub@redhat.com> wrote:
> > >
> > > On Wed, Jan 06, 2021 at 02:49:13PM +0800, Hongtao Liu wrote:
> > > > ix86_expand_fp_vec_cmp/ix86_expand_int_vec_cmp are used by vec_cmpmn
> > > > for vector comparison to vector mask, but ix86_expand_sse_cmp(which is
> > > > called in upper 2 functions.) may return integer mask whenever integer
> > > > mask is available, so convert integer mask back to vector mask if
> > > > needed.
> > > >
> > > > gcc/ChangeLog:
> > > >
> > > > PR target/98537
> > > > * config/i386/i386-expand.c (ix86_expand_fp_vec_cmp):
> > > > When cmp is integer mask, convert it to vector.
> > > > (ix86_expand_int_vec_cmp): Ditto.
> > > >
> > > > gcc/testsuite/ChangeLog:
> > > >
> > > > PR target/98537
> > > > * g++.target/i386/avx512bw-pr98537-1.C: New test.
> > > > * g++.target/i386/avx512vl-pr98537-1.C: New test.
> > > > * g++.target/i386/avx512vl-pr98537-2.C: New test.
> > >
> > > Do we optimize it then to an AVX/AVX2 comparison if possible?
A new patch is proposed to fix a series of performance and
correctness regressions introduced by r10-5250.
Integer mask comparison will now be used only for 512-bit vectors and
for 128/256-bit vcondmn (excluding the case where op_true/op_false are
all 1s and all 0s, which is actually a vec_cmpmn).
In ix86_expand_sse_cmp/ix86_expand_int_sse_cmp:
- if (ix86_valid_mask_cmp_mode (cmp_ops_mode))
+ if (GET_MODE_SIZE (mode) == 64
+ || (ix86_valid_mask_cmp_mode (cmp_ops_mode)
+ /* When op_true and op_false is NULL, vector dest is required. */
+ && op_true && op_false
+ /* Gimple sometimes transforms vec_cmpmn to vcondmn with
+ op_true/op_false as constm1_rtx/const0_rtx.
+ Don't generate integer mask comparison then. */
+ && !((vector_all_ones_operand (op_true, GET_MODE (op_true))
+ && CONST0_RTX (GET_MODE (op_false)) == op_false)
+ || (vector_all_ones_operand (op_false, GET_MODE (op_false))
+ && CONST0_RTX (GET_MODE (op_true)) == op_true))))
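The condition above can be read as a small decision function. The following C++ sketch restates it outside of GCC so the cases are easier to see; Arm and use_integer_mask are hypothetical names, and the RTL operands are reduced to an enum describing what op_true/op_false look like.

```cpp
#include <cassert>

// Hypothetical classification of op_true / op_false:
// Null    -> NULL operand (a plain comparison, vector dest required)
// AllOnes -> vector of all 1s (constm1)
// Zero    -> vector of all 0s (const0)
// Other   -> any other value
enum class Arm { Null, AllOnes, Zero, Other };

// Mirrors the shape of the new condition:
//   size == 64
//   || (valid_mask_mode && op_true && op_false
//       && !(all-ones/zero pair in either order))
bool use_integer_mask(unsigned mode_size, bool valid_mask_mode,
                      Arm op_true, Arm op_false) {
    if (mode_size == 64)
        return true;               // 512-bit: always use the integer mask
    if (!valid_mask_mode)
        return false;
    if (op_true == Arm::Null || op_false == Arm::Null)
        return false;              // vec_cmpmn: vector dest required
    // vcondmn with -1/0 arms is really a vec_cmpmn in disguise.
    bool cmp_in_disguise =
        (op_true == Arm::AllOnes && op_false == Arm::Zero)
        || (op_false == Arm::AllOnes && op_true == Arm::Zero);
    return !cmp_in_disguise;
}

// Example: 256-bit cases under a valid mask mode.
void demo() {
    assert(!use_integer_mask(32, true, Arm::Null, Arm::Null));
    assert(!use_integer_mask(32, true, Arm::AllOnes, Arm::Zero));
    assert(use_integer_mask(32, true, Arm::Other, Arm::Other));
}
```

So for 128/256-bit vectors only a genuine select between two non-trivial arms takes the integer mask path; everything that ends up as a vector of 0/-1 stays on the AVX/AVX2 style comparison.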
Bootstrapped and regtested on x86_64-linux-gnu{-m32,}.
Ok for trunk?
Ok for backport to gcc10?
gcc/ChangeLog:
PR target/98537
* config/i386/i386-expand.c (ix86_expand_sse_cmp): Don't
generate integer mask comparison for 128/256-bits vector when
op_true/op_false is NULL_RTX or CONSTM1_RTX/CONST0_RTX. Also
delete redundant !maskcmp condition.
(ix86_expand_int_sse_cmp): Ditto, but there is no redundant
condition to delete here.
(ix86_expand_sse_movcc): Delete definition of maskcmp, add the
condition directly to if (maskcmp), add extra check for
cmpmode, it should be MODE_INT.
(ix86_expand_fp_vec_cmp): Pass NULL to ix86_expand_sse_cmp's
parameters op_true/op_false.
gcc/testsuite/ChangeLog:
PR target/98537
* g++.target/i386/avx512bw-pr98537-1.C: New test.
* g++.target/i386/avx512vl-pr98537-1.C: New test.
* g++.target/i386/avx512vl-pr98537-2.C: New test.
* gcc.target/i386/avx512vl-pr88547-1.c: Adjust testcase,
integer mask comparison should not be generated.
* gcc.target/i386/avx512vl-pr92686-vpcmp-1.c: This test
guards code generation of integer mask comparison, but
integer mask comparison is not wanted for a vector comparison
with a vector dest, so delete this now-useless test.
* gcc.target/i386/avx512vl-pr92686-vpcmp-2.c: Ditto.
* gcc.target/i386/avx512vl-pr92686-vpcmp-intelasm-1.c: Ditto.
--
BR,
Hongtao
[-- Attachment #2: 0001-Fix-ICE-Don-t-generate-integer-mask-comparision-for-.patch --]
[-- Type: text/x-patch, Size: 17384 bytes --]
From 2b648edd06daa432ce65397768e25ab54f470759 Mon Sep 17 00:00:00 2001
From: liuhongt <hongtao.liu@intel.com>
Date: Thu, 7 Jan 2021 10:15:33 +0800
Subject: [PATCH] Fix ICE: Don't generate integer mask comparison for
128/256-bits vector when op_true/op_false are NULL or constm1_rtx/const0_rtx
[PR98537]
in ix86_expand_sse_cmp/ix86_expand_int_sse_cmp
- if (ix86_valid_mask_cmp_mode (cmp_ops_mode))
+ if (GET_MODE_SIZE (mode) == 64
+ || (ix86_valid_mask_cmp_mode (cmp_ops_mode)
+ /* When op_true and op_false is NULL, vector dest is required. */
+ && op_true && op_false
+ /* Gimple sometimes transforms vec_cmpmn to vcondmn with
+ op_true/op_false as constm1_rtx/const0_rtx.
+ Don't generate integer mask comparison then. */
+ && !((vector_all_ones_operand (op_true, GET_MODE (op_true))
+ && CONST0_RTX (GET_MODE (op_false)) == op_false)
+ || (vector_all_ones_operand (op_false, GET_MODE (op_false))
+ && CONST0_RTX (GET_MODE (op_true)) == op_true))))
gcc/ChangeLog:
PR target/98537
* config/i386/i386-expand.c (ix86_expand_sse_cmp): Don't
generate integer mask comparison for 128/256-bits vector when
op_true/op_false is NULL_RTX or CONSTM1_RTX/CONST0_RTX. Also
delete redundant !maskcmp condition.
(ix86_expand_int_sse_cmp): Ditto, but there is no redundant
condition to delete here.
(ix86_expand_sse_movcc): Delete definition of maskcmp, add the
condition directly to if (maskcmp), add extra check for
cmpmode, it should be MODE_INT.
(ix86_expand_fp_vec_cmp): Pass NULL to ix86_expand_sse_cmp's
parameters op_true/op_false.
gcc/testsuite/ChangeLog:
PR target/98537
* g++.target/i386/avx512bw-pr98537-1.C: New test.
* g++.target/i386/avx512vl-pr98537-1.C: New test.
* g++.target/i386/avx512vl-pr98537-2.C: New test.
* gcc.target/i386/avx512vl-pr88547-1.c: Adjust testcase,
integer mask comparison should not be generated.
* gcc.target/i386/avx512vl-pr92686-vpcmp-1.c: This test
guards code generation of integer mask comparison, but
integer mask comparison is not wanted for a vector comparison
with a vector dest, so delete this now-useless test.
* gcc.target/i386/avx512vl-pr92686-vpcmp-2.c: Ditto.
* gcc.target/i386/avx512vl-pr92686-vpcmp-intelasm-1.c: Ditto.
---
gcc/config/i386/i386-expand.c | 38 ++++--
.../g++.target/i386/avx512bw-pr98537-1.C | 11 ++
.../g++.target/i386/avx512vl-pr98537-1.C | 40 +++++++
.../g++.target/i386/avx512vl-pr98537-2.C | 8 ++
.../gcc.target/i386/avx512vl-pr88547-1.c | 10 +-
.../i386/avx512vl-pr92686-vpcmp-1.c | 112 ------------------
.../i386/avx512vl-pr92686-vpcmp-2.c | 91 --------------
.../i386/avx512vl-pr92686-vpcmp-intelasm-1.c | 111 -----------------
8 files changed, 94 insertions(+), 327 deletions(-)
create mode 100644 gcc/testsuite/g++.target/i386/avx512bw-pr98537-1.C
create mode 100644 gcc/testsuite/g++.target/i386/avx512vl-pr98537-1.C
create mode 100644 gcc/testsuite/g++.target/i386/avx512vl-pr98537-2.C
delete mode 100644 gcc/testsuite/gcc.target/i386/avx512vl-pr92686-vpcmp-1.c
delete mode 100644 gcc/testsuite/gcc.target/i386/avx512vl-pr92686-vpcmp-2.c
delete mode 100644 gcc/testsuite/gcc.target/i386/avx512vl-pr92686-vpcmp-intelasm-1.c
diff --git a/gcc/config/i386/i386-expand.c b/gcc/config/i386/i386-expand.c
index d64b4acc7dc..9758294e492 100644
--- a/gcc/config/i386/i386-expand.c
+++ b/gcc/config/i386/i386-expand.c
@@ -3485,7 +3485,17 @@ ix86_expand_sse_cmp (rtx dest, enum rtx_code code, rtx cmp_op0, rtx cmp_op1,
bool maskcmp = false;
rtx x;
- if (ix86_valid_mask_cmp_mode (cmp_ops_mode))
+ if (GET_MODE_SIZE (mode) == 64
+ || (ix86_valid_mask_cmp_mode (cmp_ops_mode)
+ /* When op_true and op_false is NULL, vector dest is required. */
+ && op_true && op_false
+ /* Gimple sometimes transforms vec_cmpmn to vcondmn with
+ op_true/op_false as constm1_rtx/const0_rtx.
+ Don't generate integer mask comparison then. */
+ && !((vector_all_ones_operand (op_true, GET_MODE (op_true))
+ && CONST0_RTX (GET_MODE (op_false)) == op_false)
+ || (vector_all_ones_operand (op_false, GET_MODE (op_false))
+ && CONST0_RTX (GET_MODE (op_true)) == op_true))))
{
unsigned int nbits = GET_MODE_NUNITS (cmp_ops_mode);
maskcmp = true;
@@ -3517,7 +3527,7 @@ ix86_expand_sse_cmp (rtx dest, enum rtx_code code, rtx cmp_op0, rtx cmp_op1,
x = gen_rtx_fmt_ee (code, cmp_mode, cmp_op0, cmp_op1);
- if (cmp_mode != mode && !maskcmp)
+ if (cmp_mode != mode)
{
x = force_reg (cmp_ops_mode, x);
convert_move (dest, x, false);
@@ -3544,9 +3554,6 @@ ix86_expand_sse_movcc (rtx dest, rtx cmp, rtx op_true, rtx op_false)
return;
}
- /* In AVX512F the result of comparison is an integer mask. */
- bool maskcmp = mode != cmpmode && ix86_valid_mask_cmp_mode (mode);
-
rtx t2, t3, x;
/* If we have an integer mask and FP value then we need
@@ -3557,7 +3564,10 @@ ix86_expand_sse_movcc (rtx dest, rtx cmp, rtx op_true, rtx op_false)
cmp = gen_rtx_SUBREG (mode, cmp, 0);
}
- if (maskcmp)
+ /* In AVX512F the result of comparison is an integer mask. */
+ if (mode != cmpmode
+ && GET_MODE_CLASS (cmpmode) == MODE_INT
+ && ix86_valid_mask_cmp_mode (mode))
{
/* Using vector move with mask register. */
cmp = force_reg (cmpmode, cmp);
@@ -4016,7 +4026,7 @@ ix86_expand_fp_vec_cmp (rtx operands[])
}
else
cmp = ix86_expand_sse_cmp (operands[0], code, operands[2], operands[3],
- operands[1], operands[2]);
+ NULL, NULL);
if (operands[0] != cmp)
emit_move_insn (operands[0], cmp);
@@ -4041,8 +4051,18 @@ ix86_expand_int_sse_cmp (rtx dest, enum rtx_code code, rtx cop0, rtx cop1,
;
/* AVX512F supports all of the comparsions
on all 128/256/512-bit vector int types. */
- else if (ix86_valid_mask_cmp_mode (mode))
- ;
+ else if (GET_MODE_SIZE (mode) == 64
+ || (ix86_valid_mask_cmp_mode (mode)
+ /* When op_true and op_false is NULL, vector dest is required. */
+ && op_true && op_false
+ /* Gimple sometimes transforms vec_cmpmn to vcondmn with
+ op_true/op_false as constm1_rtx/const0_rtx.
+ Don't generate integer mask comparison then. */
+ && !((vector_all_ones_operand (op_true, GET_MODE (op_true))
+ && CONST0_RTX (GET_MODE (op_false)) == op_false)
+ || (vector_all_ones_operand (op_false, GET_MODE (op_false))
+ && CONST0_RTX (GET_MODE (op_true)) == op_true))))
+ ;
else
{
/* Canonicalize the comparison to EQ, GT, GTU. */
diff --git a/gcc/testsuite/g++.target/i386/avx512bw-pr98537-1.C b/gcc/testsuite/g++.target/i386/avx512bw-pr98537-1.C
new file mode 100644
index 00000000000..969684a222b
--- /dev/null
+++ b/gcc/testsuite/g++.target/i386/avx512bw-pr98537-1.C
@@ -0,0 +1,11 @@
+/* PR target/98537 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=x86-64 -std=c++11" } */
+
+#define TYPEV char
+#define TYPEW short
+
+#define T_ARR \
+ __attribute__ ((target ("avx512vl,avx512bw")))
+
+#include "avx512vl-pr98537-1.C"
diff --git a/gcc/testsuite/g++.target/i386/avx512vl-pr98537-1.C b/gcc/testsuite/g++.target/i386/avx512vl-pr98537-1.C
new file mode 100644
index 00000000000..b2ba91111da
--- /dev/null
+++ b/gcc/testsuite/g++.target/i386/avx512vl-pr98537-1.C
@@ -0,0 +1,40 @@
+/* PR target/98537 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=x86-64 -std=c++11" } */
+
+#ifndef TYPEV
+#define TYPEV int
+#endif
+
+#ifndef TYPEW
+#define TYPEW long long
+#endif
+
+#ifndef T_ARR
+#define T_ARR \
+ __attribute__ ((target ("avx512vl")))
+#endif
+
+typedef TYPEV V __attribute__((__vector_size__(32)));
+typedef TYPEW W __attribute__((__vector_size__(32)));
+
+W c, d;
+struct B {};
+B e;
+struct C { W i; };
+void foo (C);
+
+C
+operator== (B, B)
+{
+ W r = (V)c == (V)d;
+ return {r};
+}
+
+void
+T_ARR
+bar ()
+{
+ B a;
+ foo (a == e);
+}
diff --git a/gcc/testsuite/g++.target/i386/avx512vl-pr98537-2.C b/gcc/testsuite/g++.target/i386/avx512vl-pr98537-2.C
new file mode 100644
index 00000000000..42c9682746d
--- /dev/null
+++ b/gcc/testsuite/g++.target/i386/avx512vl-pr98537-2.C
@@ -0,0 +1,8 @@
+/* PR target/98537 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=x86-64 -std=c++11" } */
+
+#define TYPEV float
+#define TYPEW double
+
+#include "avx512vl-pr98537-1.C"
diff --git a/gcc/testsuite/gcc.target/i386/avx512vl-pr88547-1.c b/gcc/testsuite/gcc.target/i386/avx512vl-pr88547-1.c
index a3ffeca4354..af15a6364a4 100644
--- a/gcc/testsuite/gcc.target/i386/avx512vl-pr88547-1.c
+++ b/gcc/testsuite/gcc.target/i386/avx512vl-pr88547-1.c
@@ -1,12 +1,14 @@
/* PR target/88547 */
/* { dg-do compile } */
-/* { dg-options "-O2 -mno-xop -mavx512vl -mno-avx512bw -mno-avx512dq" } */
+/* { dg-options "-O2 -mno-xop -mavx512vl -mavx512bw -mavx512dq" } */
/* { dg-final { scan-assembler-not "vpmingt\[bwdq]\[\t ]" } } */
+/* { dg-final { scan-assembler-not "%k\[0-9\]" } } */
/* { dg-final { scan-assembler-times "vpminub\[\t ]" 2 } } */
/* { dg-final { scan-assembler-times "vpminsb\[\t ]" 2 } } */
/* { dg-final { scan-assembler-times "vpminuw\[\t ]" 2 } } */
/* { dg-final { scan-assembler-times "vpminsw\[\t ]" 2 } } */
-/* { dg-final { scan-assembler-times "vpcmp\[dq\]\[\t ]" 4 } } */
-/* { dg-final { scan-assembler-times "vpcmpu\[dq\]\[\t ]" 4 } } */
-/* { dg-final { scan-assembler-times "vpternlog\[qd\]\[\t ]" 8 } } */
+/* { dg-final { scan-assembler-times "vpminud\[\t ]" 2 } } */
+/* { dg-final { scan-assembler-times "vpminsd\[\t ]" 2 } } */
+/* { dg-final { scan-assembler-times "vpminuq\[\t ]" 2 } } */
+/* { dg-final { scan-assembler-times "vpminsq\[\t ]" 2 } } */
#include "avx2-pr88547-1.c"
diff --git a/gcc/testsuite/gcc.target/i386/avx512vl-pr92686-vpcmp-1.c b/gcc/testsuite/gcc.target/i386/avx512vl-pr92686-vpcmp-1.c
deleted file mode 100644
index 5b79d4d36f9..00000000000
--- a/gcc/testsuite/gcc.target/i386/avx512vl-pr92686-vpcmp-1.c
+++ /dev/null
@@ -1,112 +0,0 @@
-/* PR target/88547 */
-/* { dg-do compile } */
-/* { dg-options "-O2 -mavx512bw -mavx512vl -mno-avx512dq -mno-xop" } */
-/* { dg-final { scan-assembler-times "vpcmp\[bwdq\]\[\t ]" 8 } } */
-/* { dg-final { scan-assembler-times "vpcmpu\[bwdq\]\[\t ]" 8 } } */
-/* { dg-final { scan-assembler-times "vpmovm2\[bw\]\[\t ]" 8 } } */
-/* { dg-final { scan-assembler-times "vpternlog\[dq\]\[\t ]" 8 } } */
-
-typedef signed char v32qi __attribute__((vector_size(32)));
-typedef unsigned char v32uqi __attribute__((vector_size(32)));
-typedef short v16hi __attribute__((vector_size(32)));
-typedef unsigned short v16uhi __attribute__((vector_size(32)));
-typedef int v8si __attribute__((vector_size(32)));
-typedef unsigned v8usi __attribute__((vector_size(32)));
-typedef long long v4di __attribute__((vector_size(32)));
-typedef unsigned long long v4udi __attribute__((vector_size(32)));
-
-__attribute__((noipa)) v32qi
-f1 (v32qi x, v32qi y)
-{
- return x >= y;
-}
-
-__attribute__((noipa)) v32uqi
-f2 (v32uqi x, v32uqi y)
-{
- return x >= y;
-}
-
-__attribute__((noipa)) v32qi
-f3 (v32qi x, v32qi y)
-{
- return x <= y;
-}
-
-__attribute__((noipa)) v32uqi
-f4 (v32uqi x, v32uqi y)
-{
- return x <= y;
-}
-
-__attribute__((noipa)) v16hi
-f5 (v16hi x, v16hi y)
-{
- return x >= y;
-}
-
-__attribute__((noipa)) v16uhi
-f6 (v16uhi x, v16uhi y)
-{
- return x >= y;
-}
-
-__attribute__((noipa)) v16hi
-f7 (v16hi x, v16hi y)
-{
- return x <= y;
-}
-
-__attribute__((noipa)) v16uhi
-f8 (v16uhi x, v16uhi y)
-{
- return x <= y;
-}
-
-__attribute__((noipa)) v8si
-f9 (v8si x, v8si y)
-{
- return x >= y;
-}
-
-__attribute__((noipa)) v8usi
-f10 (v8usi x, v8usi y)
-{
- return x >= y;
-}
-
-__attribute__((noipa)) v8si
-f11 (v8si x, v8si y)
-{
- return x <= y;
-}
-
-__attribute__((noipa)) v8usi
-f12 (v8usi x, v8usi y)
-{
- return x <= y;
-}
-
-__attribute__((noipa)) v4di
-f13 (v4di x, v4di y)
-{
- return x >= y;
-}
-
-__attribute__((noipa)) v4udi
-f14 (v4udi x, v4udi y)
-{
- return x >= y;
-}
-
-__attribute__((noipa)) v4di
-f15 (v4di x, v4di y)
-{
- return x <= y;
-}
-
-__attribute__((noipa)) v4udi
-f16 (v4udi x, v4udi y)
-{
- return x <= y;
-}
diff --git a/gcc/testsuite/gcc.target/i386/avx512vl-pr92686-vpcmp-2.c b/gcc/testsuite/gcc.target/i386/avx512vl-pr92686-vpcmp-2.c
deleted file mode 100644
index 6be24ff30f4..00000000000
--- a/gcc/testsuite/gcc.target/i386/avx512vl-pr92686-vpcmp-2.c
+++ /dev/null
@@ -1,91 +0,0 @@
-/* { dg-do run } */
-/* { dg-require-effective-target avx512bw } */
-/* { dg-require-effective-target avx512vl } */
-/* { dg-options "-O2 -mavx512bw -mavx512vl" } */
-
-#ifndef CHECK
-#define CHECK "avx512f-helper.h"
-#endif
-
-#include CHECK
-
-#ifndef TEST
-#define TEST avx512vl_test
-#endif
-
-#include "avx512vl-pr92686-vpcmp-1.c"
-
-#define NUM 256
-
-#define TEST_SIGNED(vtype, type, N, fn, op) \
-do \
- { \
- union { vtype x[NUM / N]; type i[NUM]; } dst, src1, src2; \
- int i, sign = 1; \
- type res; \
- for (i = 0; i < NUM; i++) \
- { \
- src1.i[i] = i * i * sign; \
- src2.i[i] = (i + 20) * sign; \
- sign = -sign; \
- } \
- for (i = 0; i < NUM; i += N) \
- dst.x[i / N] = fn (src1.x[i / N], src2.x[i / N]); \
- \
- for (i = 0; i < NUM; i++) \
- { \
- res = src1.i[i] op src2.i[i] ? -1 : 0; \
- if (res != dst.i[i]) \
- abort (); \
- } \
- } \
-while (0)
-
-#define TEST_UNSIGNED(vtype, type, N, fn, op) \
-do \
- { \
- union { vtype x[NUM / N]; type i[NUM]; } dst, src1, src2; \
- int i; \
- type res; \
- \
- for (i = 0; i < NUM; i++) \
- { \
- src1.i[i] = i * i; \
- src2.i[i] = i + 20; \
- if ((i % 4)) \
- src2.i[i] |= (1ULL << (sizeof (type) \
- * __CHAR_BIT__ - 1)); \
- } \
- \
- for (i = 0; i < NUM; i += N) \
- dst.x[i / N] = fn (src1.x[i / N], src2.x[i / N]); \
- \
- for (i = 0; i < NUM; i++) \
- { \
- res = src1.i[i] op src2.i[i] ? -1 : 0; \
- if (res != dst.i[i]) \
- abort (); \
- } \
- } \
-while (0)
-
-static void
-TEST (void)
-{
- TEST_SIGNED (v32qi, signed char, 32, f1, >=);
- TEST_UNSIGNED (v32uqi, unsigned char, 32, f2, >=);
- TEST_SIGNED (v32qi, signed char, 32, f3, <=);
- TEST_UNSIGNED (v32uqi, unsigned char, 32, f4, <=);
- TEST_SIGNED (v16hi, short int, 16, f5, >=);
- TEST_UNSIGNED (v16uhi, unsigned short int, 16, f6, >=);
- TEST_SIGNED (v16hi, short int, 16, f7, <=);
- TEST_UNSIGNED (v16uhi, unsigned short int, 16, f8, <=);
- TEST_SIGNED (v8si, int, 8, f9, >=);
- TEST_UNSIGNED (v8usi, unsigned int, 8, f10, >=);
- TEST_SIGNED (v8si, int, 8, f11, <=);
- TEST_UNSIGNED (v8usi, unsigned int, 8, f12, <=);
- TEST_SIGNED (v4di, long long int, 4, f13, >=);
- TEST_UNSIGNED (v4udi, unsigned long long int, 4, f14, >=);
- TEST_SIGNED (v4di, long long int, 4, f15, <=);
- TEST_UNSIGNED (v4udi, unsigned long long int, 4, f16, <=);
-}
diff --git a/gcc/testsuite/gcc.target/i386/avx512vl-pr92686-vpcmp-intelasm-1.c b/gcc/testsuite/gcc.target/i386/avx512vl-pr92686-vpcmp-intelasm-1.c
deleted file mode 100644
index 907386db08b..00000000000
--- a/gcc/testsuite/gcc.target/i386/avx512vl-pr92686-vpcmp-intelasm-1.c
+++ /dev/null
@@ -1,111 +0,0 @@
-/* PR target/88547 */
-/* { dg-do assemble } */
-/* { dg-require-effective-target masm_intel } */
-/* { dg-options "-O2 -mavx512bw -mavx512vl -mno-avx512dq -mno-xop -masm=intel" } */
-/* { dg-require-effective-target avx512bw } */
-/* { dg-require-effective-target avx512vl } */
-
-typedef signed char v32qi __attribute__((vector_size(32)));
-typedef unsigned char v32uqi __attribute__((vector_size(32)));
-typedef short v16hi __attribute__((vector_size(32)));
-typedef unsigned short v16uhi __attribute__((vector_size(32)));
-typedef int v8si __attribute__((vector_size(32)));
-typedef unsigned v8usi __attribute__((vector_size(32)));
-typedef long long v4di __attribute__((vector_size(32)));
-typedef unsigned long long v4udi __attribute__((vector_size(32)));
-
-__attribute__((noipa)) v32qi
-f1 (v32qi x, v32qi y)
-{
- return x >= y;
-}
-
-__attribute__((noipa)) v32uqi
-f2 (v32uqi x, v32uqi y)
-{
- return x >= y;
-}
-
-__attribute__((noipa)) v32qi
-f3 (v32qi x, v32qi y)
-{
- return x <= y;
-}
-
-__attribute__((noipa)) v32uqi
-f4 (v32uqi x, v32uqi y)
-{
- return x <= y;
-}
-
-__attribute__((noipa)) v16hi
-f5 (v16hi x, v16hi y)
-{
- return x >= y;
-}
-
-__attribute__((noipa)) v16uhi
-f6 (v16uhi x, v16uhi y)
-{
- return x >= y;
-}
-
-__attribute__((noipa)) v16hi
-f7 (v16hi x, v16hi y)
-{
- return x <= y;
-}
-
-__attribute__((noipa)) v16uhi
-f8 (v16uhi x, v16uhi y)
-{
- return x <= y;
-}
-
-__attribute__((noipa)) v8si
-f9 (v8si x, v8si y)
-{
- return x >= y;
-}
-
-__attribute__((noipa)) v8usi
-f10 (v8usi x, v8usi y)
-{
- return x >= y;
-}
-
-__attribute__((noipa)) v8si
-f11 (v8si x, v8si y)
-{
- return x <= y;
-}
-
-__attribute__((noipa)) v8usi
-f12 (v8usi x, v8usi y)
-{
- return x <= y;
-}
-
-__attribute__((noipa)) v4di
-f13 (v4di x, v4di y)
-{
- return x >= y;
-}
-
-__attribute__((noipa)) v4udi
-f14 (v4udi x, v4udi y)
-{
- return x >= y;
-}
-
-__attribute__((noipa)) v4di
-f15 (v4di x, v4di y)
-{
- return x <= y;
-}
-
-__attribute__((noipa)) v4udi
-f16 (v4udi x, v4udi y)
-{
- return x <= y;
-}
--
2.18.1
* Re: [PATCH] [AVX512] Fix ICE: Convert integer mask to vector in ix86_expand_fp_vec_cmp/ix86_expand_int_vec_cmp [PR98537]
2021-01-26 5:30 ` Hongtao Liu
@ 2021-02-04 5:31 ` Hongtao Liu
2021-02-04 12:00 ` Jakub Jelinek
0 siblings, 1 reply; 8+ messages in thread
From: Hongtao Liu @ 2021-02-04 5:31 UTC (permalink / raw)
To: Jakub Jelinek; +Cc: GCC Patches, Kirill Yukhin, H. J. Lu
[-- Attachment #1: Type: text/plain, Size: 1771 bytes --]
Rebase and update patch:
Fix ICE: Don't generate integer mask comparison for 128/256-bit
vectors in ix86_expand_sse_cmp/ix86_expand_int_sse_cmp when
op_true/op_false are NULL or constm1_rtx/const0_rtx [PR98537]
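For readers less familiar with the vector extension, here is a minimal sketch of what "vector comparison to vector dest" means at the source level. It assumes GCC or Clang; the names v4si, cmp_ge and lane_ref are illustrative only and are not part of the patch. A 16-byte vector is used so the sketch needs no AVX flags.

```c
/* Four 32-bit lanes in a 16-byte vector (GCC/Clang vector extension).  */
typedef int v4si __attribute__ ((vector_size (16)));

/* Lane-wise x >= y.  Each result lane is -1 (all bits set) where the
   comparison holds and 0 where it does not, so the result is itself a
   vector, not a k-register style bitmask.  */
v4si
cmp_ge (v4si x, v4si y)
{
  return x >= y;
}

/* Scalar reference for a single lane of the comparison above.  */
int
lane_ref (int x, int y)
{
  return x >= y ? -1 : 0;
}
```

Calling cmp_ge on {1, 5, -3, 7} and {2, 5, -4, 9} should give lanes that agree with lane_ref applied element by element. When the destination is such a vector, an integer mask result would have to be expanded back into lanes, which is the case the patch tries to avoid for 128/256-bit vectors.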
gcc/ChangeLog:
PR target/98537
* config/i386/i386-expand.c (ix86_expand_sse_cmp): Don't
generate integer mask comparison for 128/256-bits vector when
op_true/op_false is NULL_RTX or CONSTM1_RTX/CONST0_RTX. Also
delete redundant !maskcmp condition.
(ix86_expand_int_vec_cmp): Ditto but no redundant deletion
here.
(ix86_expand_sse_movcc): Delete definition of maskcmp, add the
condition directly to if (maskcmp), add extra check for
cmpmode, it should be MODE_INT.
(ix86_expand_fp_vec_cmp): Pass NULL to ix86_expand_sse_cmp's
parameters op_true/op_false.
(ix86_use_mask_cmp_p): New.
gcc/testsuite/ChangeLog:
PR target/98537
* g++.target/i386/avx512bw-pr98537-1.C: New test.
* g++.target/i386/avx512vl-pr98537-1.C: New test.
* g++.target/i386/avx512vl-pr98537-2.C: New test.
* gcc.target/i386/avx512vl-pr88547-1.c: Adjust testcase,
integer mask comparison should not be generated.
* gcc.target/i386/avx512vl-pr92686-vpcmp-1.c: This test is
used to guard code generation of integer mask comparison, but
for vector comparison to vector dest, integer mask comparison
is disliked, so detele this useless test.
* gcc.target/i386/avx512vl-pr92686-vpcmp-2.c: Ditto.
* gcc.target/i386/avx512vl-pr92686-vpcmp-intelasm-1.c: Ditto.
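
To make the decision in the new ix86_use_mask_cmp_p easier to follow, the sketch below models it outside of GCC. Everything here is an assumption made for illustration: enum operand and struct cmp_query are hypothetical stand-ins for rtx operands and machine modes, and the valid_mask_cmp_mode flag simply models the result of ix86_valid_mask_cmp_mode. Only the order of the checks is taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the rtx operands; these are not GCC types.
   OP_NONE models a NULL rtx.  */
enum operand { OP_NONE, OP_ZERO, OP_ALL_ONES, OP_OTHER };

struct cmp_query
{
  unsigned vec_bytes;        /* size of the data vector: 16, 32 or 64 */
  bool integral;             /* models INTEGRAL_MODE_P (mode) */
  bool valid_mask_cmp_mode;  /* models ix86_valid_mask_cmp_mode (cmp_mode) */
  enum operand op_true;
  enum operand op_false;
};

/* Model of the patch's predicate: should an integer mask compare be used?  */
bool
use_mask_cmp_p (const struct cmp_query *q)
{
  /* 512-bit vectors always use the integer mask form.  */
  if (q->vec_bytes == 64)
    return true;

  /* op_true and op_false are either both absent or both present.  */
  assert ((q->op_true == OP_NONE) == (q->op_false == OP_NONE));

  /* No select operands, or no mask compare for this mode: the result
     must be a vector.  */
  if (q->op_true == OP_NONE || !q->valid_mask_cmp_mode)
    return false;

  /* Selects against zero (or all-ones for integer vectors) are left to
     the vector path.  */
  if (q->op_false == OP_ZERO || q->op_true == OP_ZERO
      || (q->integral
          && (q->op_true == OP_ALL_ONES || q->op_false == OP_ALL_ONES)))
    return false;

  return true;
}
```

Under this model, a 256-bit integer compare with no select operands, or with all-ones/zero as the select operands, takes the vector path, while a 256-bit select between two arbitrary operands may still use a mask register.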
--
BR,
Hongtao
[-- Attachment #2: 0001-Fix-ICE-Don-t-generate-integer-mask-comparision-for-.patch --]
[-- Type: application/octet-stream, Size: 16740 bytes --]
From 14e7303b54aa1db81b0ef3db27b9309445adc842 Mon Sep 17 00:00:00 2001
From: liuhongt <hongtao.liu@intel.com>
Date: Thu, 7 Jan 2021 10:15:33 +0800
Subject: [PATCH] Fix ICE: Don't generate integer mask comparision for
128/256-bits vector when op_true/op_false are NULL or constm1_rtx/const0_rtx
[PR98537]
in ix86_expand_sse_cmp/ix86_expand_int_sse_cmp
gcc/ChangeLog:
PR target/98537
* config/i386/i386-expand.c (ix86_expand_sse_cmp): Don't
generate integer mask comparison for 128/256-bits vector when
op_true/op_false is NULL_RTX or CONSTM1_RTX/CONST0_RTX. Also
delete redundant !maskcmp condition.
(ix86_expand_int_vec_cmp): Ditto but no redundant deletion
here.
(ix86_expand_sse_movcc): Delete definition of maskcmp, add the
condition directly to if (maskcmp), add extra check for
cmpmode, it should be MODE_INT.
(ix86_expand_fp_vec_cmp): Pass NULL to ix86_expand_sse_cmp's
parameters op_true/op_false.
(ix86_use_mask_cmp_p): New.
gcc/testsuite/ChangeLog:
PR target/98537
* g++.target/i386/avx512bw-pr98537-1.C: New test.
* g++.target/i386/avx512vl-pr98537-1.C: New test.
* g++.target/i386/avx512vl-pr98537-2.C: New test.
* gcc.target/i386/avx512vl-pr88547-1.c: Adjust testcase,
integer mask comparison should not be generated.
* gcc.target/i386/avx512vl-pr92686-vpcmp-1.c: This test is
used to guard code generation of integer mask comparison, but
for vector comparison to vector dest, integer mask comparison
is disliked, so detele this useless test.
* gcc.target/i386/avx512vl-pr92686-vpcmp-2.c: Ditto.
* gcc.target/i386/avx512vl-pr92686-vpcmp-intelasm-1.c: Ditto.
---
gcc/config/i386/i386-expand.c | 43 +++++--
.../g++.target/i386/avx512bw-pr98537-1.C | 11 ++
.../g++.target/i386/avx512vl-pr98537-1.C | 40 +++++++
.../g++.target/i386/avx512vl-pr98537-2.C | 8 ++
.../gcc.target/i386/avx512vl-pr88547-1.c | 10 +-
.../i386/avx512vl-pr92686-vpcmp-1.c | 112 ------------------
.../i386/avx512vl-pr92686-vpcmp-2.c | 91 --------------
.../i386/avx512vl-pr92686-vpcmp-intelasm-1.c | 111 -----------------
8 files changed, 100 insertions(+), 326 deletions(-)
create mode 100644 gcc/testsuite/g++.target/i386/avx512bw-pr98537-1.C
create mode 100644 gcc/testsuite/g++.target/i386/avx512vl-pr98537-1.C
create mode 100644 gcc/testsuite/g++.target/i386/avx512vl-pr98537-2.C
delete mode 100644 gcc/testsuite/gcc.target/i386/avx512vl-pr92686-vpcmp-1.c
delete mode 100644 gcc/testsuite/gcc.target/i386/avx512vl-pr92686-vpcmp-2.c
delete mode 100644 gcc/testsuite/gcc.target/i386/avx512vl-pr92686-vpcmp-intelasm-1.c
diff --git a/gcc/config/i386/i386-expand.c b/gcc/config/i386/i386-expand.c
index d64b4acc7dc..0464944b9d5 100644
--- a/gcc/config/i386/i386-expand.c
+++ b/gcc/config/i386/i386-expand.c
@@ -3469,6 +3469,33 @@ ix86_valid_mask_cmp_mode (machine_mode mode)
return vector_size == 64 || TARGET_AVX512VL;
}
+/* Return true if integer mask comparison should be used. */
+static bool
+ix86_use_mask_cmp_p (machine_mode mode, machine_mode cmp_mode,
+ rtx op_true, rtx op_false)
+{
+ if (GET_MODE_SIZE (mode) == 64)
+ return true;
+
+ /* When op_true is NULL, op_flase must be NULL, vice either. */
+ gcc_assert (!op_true == !op_false);
+
+ /* When op_true/op_false is NULL or cmp_mode is not valid mask cmp mode,
+ vector dest is required. */
+ if (!op_true || !ix86_valid_mask_cmp_mode (cmp_mode))
+ return false;
+
+ /* Exclude those could be optimized in ix86_expand_sse_movcc. */
+ if (op_false == CONST0_RTX (mode)
+ || op_true == CONST0_RTX (mode)
+ || (INTEGRAL_MODE_P (mode)
+ && (op_true == CONSTM1_RTX (mode)
+ || op_false == CONSTM1_RTX (mode))))
+ return false;
+
+ return true;
+}
+
/* Expand an SSE comparison. Return the register with the result. */
static rtx
@@ -3485,7 +3512,7 @@ ix86_expand_sse_cmp (rtx dest, enum rtx_code code, rtx cmp_op0, rtx cmp_op1,
bool maskcmp = false;
rtx x;
- if (ix86_valid_mask_cmp_mode (cmp_ops_mode))
+ if (ix86_use_mask_cmp_p (mode, cmp_ops_mode, op_true, op_false))
{
unsigned int nbits = GET_MODE_NUNITS (cmp_ops_mode);
maskcmp = true;
@@ -3517,7 +3544,7 @@ ix86_expand_sse_cmp (rtx dest, enum rtx_code code, rtx cmp_op0, rtx cmp_op1,
x = gen_rtx_fmt_ee (code, cmp_mode, cmp_op0, cmp_op1);
- if (cmp_mode != mode && !maskcmp)
+ if (cmp_mode != mode)
{
x = force_reg (cmp_ops_mode, x);
convert_move (dest, x, false);
@@ -3544,9 +3571,6 @@ ix86_expand_sse_movcc (rtx dest, rtx cmp, rtx op_true, rtx op_false)
return;
}
- /* In AVX512F the result of comparison is an integer mask. */
- bool maskcmp = mode != cmpmode && ix86_valid_mask_cmp_mode (mode);
-
rtx t2, t3, x;
/* If we have an integer mask and FP value then we need
@@ -3557,8 +3581,11 @@ ix86_expand_sse_movcc (rtx dest, rtx cmp, rtx op_true, rtx op_false)
cmp = gen_rtx_SUBREG (mode, cmp, 0);
}
- if (maskcmp)
+ /* In AVX512F the result of comparison is an integer mask. */
+ if (mode != cmpmode
+ && GET_MODE_CLASS (cmpmode) == MODE_INT)
{
+ gcc_assert (ix86_valid_mask_cmp_mode (mode));
/* Using vector move with mask register. */
cmp = force_reg (cmpmode, cmp);
/* Optimize for mask zero. */
@@ -4016,7 +4043,7 @@ ix86_expand_fp_vec_cmp (rtx operands[])
}
else
cmp = ix86_expand_sse_cmp (operands[0], code, operands[2], operands[3],
- operands[1], operands[2]);
+ NULL, NULL);
if (operands[0] != cmp)
emit_move_insn (operands[0], cmp);
@@ -4041,7 +4068,7 @@ ix86_expand_int_sse_cmp (rtx dest, enum rtx_code code, rtx cop0, rtx cop1,
;
/* AVX512F supports all of the comparsions
on all 128/256/512-bit vector int types. */
- else if (ix86_valid_mask_cmp_mode (mode))
+ else if (ix86_use_mask_cmp_p (data_mode, mode, op_true, op_false))
;
else
{
diff --git a/gcc/testsuite/g++.target/i386/avx512bw-pr98537-1.C b/gcc/testsuite/g++.target/i386/avx512bw-pr98537-1.C
new file mode 100644
index 00000000000..969684a222b
--- /dev/null
+++ b/gcc/testsuite/g++.target/i386/avx512bw-pr98537-1.C
@@ -0,0 +1,11 @@
+/* PR target/98537 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=x86-64 -std=c++11" } */
+
+#define TYPEV char
+#define TYPEW short
+
+#define T_ARR \
+ __attribute__ ((target ("avx512vl,avx512bw")))
+
+#include "avx512vl-pr98537-1.C"
diff --git a/gcc/testsuite/g++.target/i386/avx512vl-pr98537-1.C b/gcc/testsuite/g++.target/i386/avx512vl-pr98537-1.C
new file mode 100644
index 00000000000..b2ba91111da
--- /dev/null
+++ b/gcc/testsuite/g++.target/i386/avx512vl-pr98537-1.C
@@ -0,0 +1,40 @@
+/* PR target/98537 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=x86-64 -std=c++11" } */
+
+#ifndef TYPEV
+#define TYPEV int
+#endif
+
+#ifndef TYPEW
+#define TYPEW long long
+#endif
+
+#ifndef T_ARR
+#define T_ARR \
+ __attribute__ ((target ("avx512vl")))
+#endif
+
+typedef TYPEV V __attribute__((__vector_size__(32)));
+typedef TYPEW W __attribute__((__vector_size__(32)));
+
+W c, d;
+struct B {};
+B e;
+struct C { W i; };
+void foo (C);
+
+C
+operator== (B, B)
+{
+ W r = (V)c == (V)d;
+ return {r};
+}
+
+void
+T_ARR
+bar ()
+{
+ B a;
+ foo (a == e);
+}
diff --git a/gcc/testsuite/g++.target/i386/avx512vl-pr98537-2.C b/gcc/testsuite/g++.target/i386/avx512vl-pr98537-2.C
new file mode 100644
index 00000000000..42c9682746d
--- /dev/null
+++ b/gcc/testsuite/g++.target/i386/avx512vl-pr98537-2.C
@@ -0,0 +1,8 @@
+/* PR target/98537 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=x86-64 -std=c++11" } */
+
+#define TYPEV float
+#define TYPEW double
+
+#include "avx512vl-pr98537-1.C"
diff --git a/gcc/testsuite/gcc.target/i386/avx512vl-pr88547-1.c b/gcc/testsuite/gcc.target/i386/avx512vl-pr88547-1.c
index a3ffeca4354..af15a6364a4 100644
--- a/gcc/testsuite/gcc.target/i386/avx512vl-pr88547-1.c
+++ b/gcc/testsuite/gcc.target/i386/avx512vl-pr88547-1.c
@@ -1,12 +1,14 @@
/* PR target/88547 */
/* { dg-do compile } */
-/* { dg-options "-O2 -mno-xop -mavx512vl -mno-avx512bw -mno-avx512dq" } */
+/* { dg-options "-O2 -mno-xop -mavx512vl -mavx512bw -mavx512dq" } */
/* { dg-final { scan-assembler-not "vpmingt\[bwdq]\[\t ]" } } */
+/* { dg-final { scan-assembler-not "%k\[0-9\]" } } */
/* { dg-final { scan-assembler-times "vpminub\[\t ]" 2 } } */
/* { dg-final { scan-assembler-times "vpminsb\[\t ]" 2 } } */
/* { dg-final { scan-assembler-times "vpminuw\[\t ]" 2 } } */
/* { dg-final { scan-assembler-times "vpminsw\[\t ]" 2 } } */
-/* { dg-final { scan-assembler-times "vpcmp\[dq\]\[\t ]" 4 } } */
-/* { dg-final { scan-assembler-times "vpcmpu\[dq\]\[\t ]" 4 } } */
-/* { dg-final { scan-assembler-times "vpternlog\[qd\]\[\t ]" 8 } } */
+/* { dg-final { scan-assembler-times "vpminud\[\t ]" 2 } } */
+/* { dg-final { scan-assembler-times "vpminsd\[\t ]" 2 } } */
+/* { dg-final { scan-assembler-times "vpminuq\[\t ]" 2 } } */
+/* { dg-final { scan-assembler-times "vpminsq\[\t ]" 2 } } */
#include "avx2-pr88547-1.c"
diff --git a/gcc/testsuite/gcc.target/i386/avx512vl-pr92686-vpcmp-1.c b/gcc/testsuite/gcc.target/i386/avx512vl-pr92686-vpcmp-1.c
deleted file mode 100644
index 5b79d4d36f9..00000000000
--- a/gcc/testsuite/gcc.target/i386/avx512vl-pr92686-vpcmp-1.c
+++ /dev/null
@@ -1,112 +0,0 @@
-/* PR target/88547 */
-/* { dg-do compile } */
-/* { dg-options "-O2 -mavx512bw -mavx512vl -mno-avx512dq -mno-xop" } */
-/* { dg-final { scan-assembler-times "vpcmp\[bwdq\]\[\t ]" 8 } } */
-/* { dg-final { scan-assembler-times "vpcmpu\[bwdq\]\[\t ]" 8 } } */
-/* { dg-final { scan-assembler-times "vpmovm2\[bw\]\[\t ]" 8 } } */
-/* { dg-final { scan-assembler-times "vpternlog\[dq\]\[\t ]" 8 } } */
-
-typedef signed char v32qi __attribute__((vector_size(32)));
-typedef unsigned char v32uqi __attribute__((vector_size(32)));
-typedef short v16hi __attribute__((vector_size(32)));
-typedef unsigned short v16uhi __attribute__((vector_size(32)));
-typedef int v8si __attribute__((vector_size(32)));
-typedef unsigned v8usi __attribute__((vector_size(32)));
-typedef long long v4di __attribute__((vector_size(32)));
-typedef unsigned long long v4udi __attribute__((vector_size(32)));
-
-__attribute__((noipa)) v32qi
-f1 (v32qi x, v32qi y)
-{
- return x >= y;
-}
-
-__attribute__((noipa)) v32uqi
-f2 (v32uqi x, v32uqi y)
-{
- return x >= y;
-}
-
-__attribute__((noipa)) v32qi
-f3 (v32qi x, v32qi y)
-{
- return x <= y;
-}
-
-__attribute__((noipa)) v32uqi
-f4 (v32uqi x, v32uqi y)
-{
- return x <= y;
-}
-
-__attribute__((noipa)) v16hi
-f5 (v16hi x, v16hi y)
-{
- return x >= y;
-}
-
-__attribute__((noipa)) v16uhi
-f6 (v16uhi x, v16uhi y)
-{
- return x >= y;
-}
-
-__attribute__((noipa)) v16hi
-f7 (v16hi x, v16hi y)
-{
- return x <= y;
-}
-
-__attribute__((noipa)) v16uhi
-f8 (v16uhi x, v16uhi y)
-{
- return x <= y;
-}
-
-__attribute__((noipa)) v8si
-f9 (v8si x, v8si y)
-{
- return x >= y;
-}
-
-__attribute__((noipa)) v8usi
-f10 (v8usi x, v8usi y)
-{
- return x >= y;
-}
-
-__attribute__((noipa)) v8si
-f11 (v8si x, v8si y)
-{
- return x <= y;
-}
-
-__attribute__((noipa)) v8usi
-f12 (v8usi x, v8usi y)
-{
- return x <= y;
-}
-
-__attribute__((noipa)) v4di
-f13 (v4di x, v4di y)
-{
- return x >= y;
-}
-
-__attribute__((noipa)) v4udi
-f14 (v4udi x, v4udi y)
-{
- return x >= y;
-}
-
-__attribute__((noipa)) v4di
-f15 (v4di x, v4di y)
-{
- return x <= y;
-}
-
-__attribute__((noipa)) v4udi
-f16 (v4udi x, v4udi y)
-{
- return x <= y;
-}
diff --git a/gcc/testsuite/gcc.target/i386/avx512vl-pr92686-vpcmp-2.c b/gcc/testsuite/gcc.target/i386/avx512vl-pr92686-vpcmp-2.c
deleted file mode 100644
index 6be24ff30f4..00000000000
--- a/gcc/testsuite/gcc.target/i386/avx512vl-pr92686-vpcmp-2.c
+++ /dev/null
@@ -1,91 +0,0 @@
-/* { dg-do run } */
-/* { dg-require-effective-target avx512bw } */
-/* { dg-require-effective-target avx512vl } */
-/* { dg-options "-O2 -mavx512bw -mavx512vl" } */
-
-#ifndef CHECK
-#define CHECK "avx512f-helper.h"
-#endif
-
-#include CHECK
-
-#ifndef TEST
-#define TEST avx512vl_test
-#endif
-
-#include "avx512vl-pr92686-vpcmp-1.c"
-
-#define NUM 256
-
-#define TEST_SIGNED(vtype, type, N, fn, op) \
-do \
- { \
- union { vtype x[NUM / N]; type i[NUM]; } dst, src1, src2; \
- int i, sign = 1; \
- type res; \
- for (i = 0; i < NUM; i++) \
- { \
- src1.i[i] = i * i * sign; \
- src2.i[i] = (i + 20) * sign; \
- sign = -sign; \
- } \
- for (i = 0; i < NUM; i += N) \
- dst.x[i / N] = fn (src1.x[i / N], src2.x[i / N]); \
- \
- for (i = 0; i < NUM; i++) \
- { \
- res = src1.i[i] op src2.i[i] ? -1 : 0; \
- if (res != dst.i[i]) \
- abort (); \
- } \
- } \
-while (0)
-
-#define TEST_UNSIGNED(vtype, type, N, fn, op) \
-do \
- { \
- union { vtype x[NUM / N]; type i[NUM]; } dst, src1, src2; \
- int i; \
- type res; \
- \
- for (i = 0; i < NUM; i++) \
- { \
- src1.i[i] = i * i; \
- src2.i[i] = i + 20; \
- if ((i % 4)) \
- src2.i[i] |= (1ULL << (sizeof (type) \
- * __CHAR_BIT__ - 1)); \
- } \
- \
- for (i = 0; i < NUM; i += N) \
- dst.x[i / N] = fn (src1.x[i / N], src2.x[i / N]); \
- \
- for (i = 0; i < NUM; i++) \
- { \
- res = src1.i[i] op src2.i[i] ? -1 : 0; \
- if (res != dst.i[i]) \
- abort (); \
- } \
- } \
-while (0)
-
-static void
-TEST (void)
-{
- TEST_SIGNED (v32qi, signed char, 32, f1, >=);
- TEST_UNSIGNED (v32uqi, unsigned char, 32, f2, >=);
- TEST_SIGNED (v32qi, signed char, 32, f3, <=);
- TEST_UNSIGNED (v32uqi, unsigned char, 32, f4, <=);
- TEST_SIGNED (v16hi, short int, 16, f5, >=);
- TEST_UNSIGNED (v16uhi, unsigned short int, 16, f6, >=);
- TEST_SIGNED (v16hi, short int, 16, f7, <=);
- TEST_UNSIGNED (v16uhi, unsigned short int, 16, f8, <=);
- TEST_SIGNED (v8si, int, 8, f9, >=);
- TEST_UNSIGNED (v8usi, unsigned int, 8, f10, >=);
- TEST_SIGNED (v8si, int, 8, f11, <=);
- TEST_UNSIGNED (v8usi, unsigned int, 8, f12, <=);
- TEST_SIGNED (v4di, long long int, 4, f13, >=);
- TEST_UNSIGNED (v4udi, unsigned long long int, 4, f14, >=);
- TEST_SIGNED (v4di, long long int, 4, f15, <=);
- TEST_UNSIGNED (v4udi, unsigned long long int, 4, f16, <=);
-}
diff --git a/gcc/testsuite/gcc.target/i386/avx512vl-pr92686-vpcmp-intelasm-1.c b/gcc/testsuite/gcc.target/i386/avx512vl-pr92686-vpcmp-intelasm-1.c
deleted file mode 100644
index 907386db08b..00000000000
--- a/gcc/testsuite/gcc.target/i386/avx512vl-pr92686-vpcmp-intelasm-1.c
+++ /dev/null
@@ -1,111 +0,0 @@
-/* PR target/88547 */
-/* { dg-do assemble } */
-/* { dg-require-effective-target masm_intel } */
-/* { dg-options "-O2 -mavx512bw -mavx512vl -mno-avx512dq -mno-xop -masm=intel" } */
-/* { dg-require-effective-target avx512bw } */
-/* { dg-require-effective-target avx512vl } */
-
-typedef signed char v32qi __attribute__((vector_size(32)));
-typedef unsigned char v32uqi __attribute__((vector_size(32)));
-typedef short v16hi __attribute__((vector_size(32)));
-typedef unsigned short v16uhi __attribute__((vector_size(32)));
-typedef int v8si __attribute__((vector_size(32)));
-typedef unsigned v8usi __attribute__((vector_size(32)));
-typedef long long v4di __attribute__((vector_size(32)));
-typedef unsigned long long v4udi __attribute__((vector_size(32)));
-
-__attribute__((noipa)) v32qi
-f1 (v32qi x, v32qi y)
-{
- return x >= y;
-}
-
-__attribute__((noipa)) v32uqi
-f2 (v32uqi x, v32uqi y)
-{
- return x >= y;
-}
-
-__attribute__((noipa)) v32qi
-f3 (v32qi x, v32qi y)
-{
- return x <= y;
-}
-
-__attribute__((noipa)) v32uqi
-f4 (v32uqi x, v32uqi y)
-{
- return x <= y;
-}
-
-__attribute__((noipa)) v16hi
-f5 (v16hi x, v16hi y)
-{
- return x >= y;
-}
-
-__attribute__((noipa)) v16uhi
-f6 (v16uhi x, v16uhi y)
-{
- return x >= y;
-}
-
-__attribute__((noipa)) v16hi
-f7 (v16hi x, v16hi y)
-{
- return x <= y;
-}
-
-__attribute__((noipa)) v16uhi
-f8 (v16uhi x, v16uhi y)
-{
- return x <= y;
-}
-
-__attribute__((noipa)) v8si
-f9 (v8si x, v8si y)
-{
- return x >= y;
-}
-
-__attribute__((noipa)) v8usi
-f10 (v8usi x, v8usi y)
-{
- return x >= y;
-}
-
-__attribute__((noipa)) v8si
-f11 (v8si x, v8si y)
-{
- return x <= y;
-}
-
-__attribute__((noipa)) v8usi
-f12 (v8usi x, v8usi y)
-{
- return x <= y;
-}
-
-__attribute__((noipa)) v4di
-f13 (v4di x, v4di y)
-{
- return x >= y;
-}
-
-__attribute__((noipa)) v4udi
-f14 (v4udi x, v4udi y)
-{
- return x >= y;
-}
-
-__attribute__((noipa)) v4di
-f15 (v4di x, v4di y)
-{
- return x <= y;
-}
-
-__attribute__((noipa)) v4udi
-f16 (v4udi x, v4udi y)
-{
- return x <= y;
-}
--
2.18.1
* Re: [PATCH] [AVX512] Fix ICE: Convert integer mask to vector in ix86_expand_fp_vec_cmp/ix86_expand_int_vec_cmp [PR98537]
2021-02-04 5:31 ` Hongtao Liu
@ 2021-02-04 12:00 ` Jakub Jelinek
2021-02-05 1:52 ` Hongtao Liu
0 siblings, 1 reply; 8+ messages in thread
From: Jakub Jelinek @ 2021-02-04 12:00 UTC (permalink / raw)
To: Hongtao Liu; +Cc: GCC Patches
On Thu, Feb 04, 2021 at 01:31:52PM +0800, Hongtao Liu via Gcc-patches wrote:
* gcc.target/i386/avx512vl-pr92686-vpcmp-1.c: This test is
used to guard code generation of integer mask comparison, but
for vector comparison to vector dest, integer mask comparison
is disliked, so detele this useless test.
* gcc.target/i386/avx512vl-pr92686-vpcmp-2.c: Ditto.
* gcc.target/i386/avx512vl-pr92686-vpcmp-intelasm-1.c: Ditto.
s/detele/delete/; but I'd say just write : Remove.
for all 3 tests, the explanation should go into the commit message, not
ChangeLog.
+ /* When op_true is NULL, op_flase must be NULL, vice either. */
s/flase/false/
s/vice either/or vice versa/
+ gcc_assert (!op_true == !op_false);
+
+ /* When op_true/op_false is NULL or cmp_mode is not valid mask cmp mode,
+ vector dest is required. */
+ if (!op_true || !ix86_valid_mask_cmp_mode (cmp_mode))
+ return false;
+
+ /* Exclude those could be optimized in ix86_expand_sse_movcc. */
s/those/those that/
Otherwise LGTM.
Jakub
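
As a side note on the assertion quoted above, gcc_assert (!op_true == !op_false) uses a common C idiom for "both null or both non-null". A minimal standalone sketch follows; the function name both_or_neither is made up for illustration and plain pointers stand in for rtx values.

```c
#include <stdbool.h>
#include <stddef.h>

/* True when both pointers are null or both are non-null.  Applying '!'
   first collapses each pointer to 0 or 1, so the comparison looks only
   at null-ness and never at the addresses themselves.  */
bool
both_or_neither (const void *op_true, const void *op_false)
{
  return !op_true == !op_false;
}
```

Comparing the pointers directly would instead test whether they point at the same object, which is not what the assertion is meant to check.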
* Re: [PATCH] [AVX512] Fix ICE: Convert integer mask to vector in ix86_expand_fp_vec_cmp/ix86_expand_int_vec_cmp [PR98537]
2021-02-04 12:00 ` Jakub Jelinek
@ 2021-02-05 1:52 ` Hongtao Liu
0 siblings, 0 replies; 8+ messages in thread
From: Hongtao Liu @ 2021-02-05 1:52 UTC (permalink / raw)
To: Jakub Jelinek; +Cc: GCC Patches
On Thu, Feb 4, 2021 at 8:00 PM Jakub Jelinek <jakub@redhat.com> wrote:
>
> On Thu, Feb 04, 2021 at 01:31:52PM +0800, Hongtao Liu via Gcc-patches wrote:
> * gcc.target/i386/avx512vl-pr92686-vpcmp-1.c: This test is
> used to guard code generation of integer mask comparison, but
> for vector comparison to vector dest, integer mask comparison
> is disliked, so detele this useless test.
> * gcc.target/i386/avx512vl-pr92686-vpcmp-2.c: Ditto.
> * gcc.target/i386/avx512vl-pr92686-vpcmp-intelasm-1.c: Ditto.
>
> s/detele/delete/; but I'd say just write : Remove.
> for all 3 tests, the explanation should go into the commit message, not
> ChangeLog.
> + /* When op_true is NULL, op_flase must be NULL, vice either. */
>
> s/flase/false/
> s/vice either/or vice versa/
>
> + gcc_assert (!op_true == !op_false);
> +
> + /* When op_true/op_false is NULL or cmp_mode is not valid mask cmp mode,
> + vector dest is required. */
> + if (!op_true || !ix86_valid_mask_cmp_mode (cmp_mode))
> + return false;
> +
> + /* Exclude those could be optimized in ix86_expand_sse_movcc. */
>
> s/those/those that/
>
> Otherwise LGTM.
>
Ok for backport?
> Jakub
>
--
BR,
Hongtao
end of thread, other threads:[~2021-02-05 1:53 UTC | newest]
Thread overview: 8+ messages
2021-01-06 6:49 [PATCH] [AVX512] Fix ICE: Convert integer mask to vector in ix86_expand_fp_vec_cmp/ix86_expand_int_vec_cmp [PR98537] Hongtao Liu
2021-01-06 14:39 ` Jakub Jelinek
2021-01-07 5:22 ` Hongtao Liu
2021-01-14 11:16 ` Hongtao Liu
2021-01-26 5:30 ` Hongtao Liu
2021-02-04 5:31 ` Hongtao Liu
2021-02-04 12:00 ` Jakub Jelinek
2021-02-05 1:52 ` Hongtao Liu