From: Hongtao Liu <crazylht@gmail.com>
To: Jakub Jelinek <jakub@redhat.com>
Cc: GCC Patches <gcc-patches@gcc.gnu.org>,
Kirill Yukhin <kirill.yukhin@gmail.com>,
"H. J. Lu" <hjl.tools@gmail.com>
Subject: Re: [PATCH] [AVX512] Fix ICE: Convert integer mask to vector in ix86_expand_fp_vec_cmp/ix86_expand_int_vec_cmp [PR98537]
Date: Thu, 7 Jan 2021 13:22:00 +0800 [thread overview]
Message-ID: <CAMZc-bxQAL8RPp14nPCVEo+9TGmNUxf42LFDRwFsri2YX2O=9A@mail.gmail.com> (raw)
In-Reply-To: <20210106143931.GC725145@tucnak>
[-- Attachment #1: Type: text/plain, Size: 3017 bytes --]
On Wed, Jan 6, 2021 at 10:39 PM Jakub Jelinek <jakub@redhat.com> wrote:
>
> On Wed, Jan 06, 2021 at 02:49:13PM +0800, Hongtao Liu wrote:
> > ix86_expand_fp_vec_cmp/ix86_expand_int_vec_cmp are used by vec_cmpmn
> > for vector comparison to vector mask, but ix86_expand_sse_cmp(which is
> > called in upper 2 functions.) may return integer mask whenever integer
> > mask is available, so convert integer mask back to vector mask if
> > needed.
> >
> > gcc/ChangeLog:
> >
> > PR target/98537
> > * config/i386/i386-expand.c (ix86_expand_fp_vec_cmp):
> > When cmp is integer mask, convert it to vector.
> > (ix86_expand_int_vec_cmp): Ditto.
> >
> > gcc/testsuite/ChangeLog:
> >
> > PR target/98537
> > * g++.target/i386/avx512bw-pr98537-1.C: New test.
> > * g++.target/i386/avx512vl-pr98537-1.C: New test.
> > * g++.target/i386/avx512vl-pr98537-2.C: New test.
>
> Do we optimize it then to an AVX/AVX2 comparison if possible?
>
When i'm looking at the code, i find there's other places which
require comparison dest to be vector(i.e. ix86_expand_sse_unpack,
ix86_expand_mul_widen_evenodd). It's a potential bug.
So I fix this bug in another way which won't generate an integer mask
when the comparison dest is required to a vector mask.
Update patch:
ix86_expand_sse_cmp/ix86_expand_int_sse_cmp is used for vector
comparison, considering that avx512 introduces integer mask, but some
original callers require the dest of comparison to be a vector.
So add a new parameter vector_mask_p to control the result
of vector comparison to be vector or not.
regtested/bootstrapped on x86_64-linux-gnu{-m32,}.
gcc/ChangeLog:
PR target/98537
* config/i386/i386-expand.c (ix86_expand_sse_cmp): Add a new
parameter vector_mask_p to control whether the comparison
result should be a vector or not.
(ix86_expand_int_sse_cmp): Ditto.
(ix86_expand_sse_movcc): cmpmode should be MODE_INT.
(ix86_expand_fp_movcc): Allow vector comparison dest as
integer mask.
(ix86_expand_fp_vcond): Ditto.
(ix86_expand_int_vcond): Ditto.
(ix86_expand_fp_vec_cmp): Require vector comparison dest as
vector.
(ix86_expand_int_vec_cmp): Ditto.
(ix86_expand_sse_unpack): Ditto.
(ix86_expand_mul_widen_evenodd): Ditto.
gcc/testsuite/ChangeLog:
PR target/98537
* g++.target/i386/avx512bw-pr98537-1.C: New test.
* g++.target/i386/avx512vl-pr98537-1.C: New test.
* g++.target/i386/avx512vl-pr98537-2.C: New test.
> @@ -4024,8 +4025,18 @@ ix86_expand_fp_vec_cmp (rtx operands[])
> cmp = ix86_expand_sse_cmp (operands[0], code, operands[2], operands[3],
> operands[1], operands[2]);
>
> - if (operands[0] != cmp)
> - emit_move_insn (operands[0], cmp);
> + if (operands[0] != cmp)
> + {
>
> The indentation of the if above looks wrong.
> Otherwise LGTM.
>
> Jakub
>
--
BR,
Hongtao
[-- Attachment #2: 0001-Fix-ICE-Convert-integer-mask-to-vector-in-ix86_expan_V2.patch --]
[-- Type: text/x-patch, Size: 11539 bytes --]
From bae4500e17f7590a45504c8c9e3ab0fe6200681d Mon Sep 17 00:00:00 2001
From: liuhongt <hongtao.liu@intel.com>
Date: Thu, 7 Jan 2021 10:15:33 +0800
Subject: [PATCH] Fix ICE: Convert integer mask to vector in
ix86_expand_fp_vec_cmp/ix86_expand_int_vec_cmp [PR98537]
ix86_expand_sse_cmp/ix86_expand_int_sse_cmp is used for vector
comparison, considering that avx512 introduces integer mask, but some
original callers require the dest of comparison to be a vector.
So add a new parameter vector_mask_p to control the result
of vector comparison to be vector or not.
gcc/ChangeLog:
PR target/98537
* config/i386/i386-expand.c (ix86_expand_sse_cmp): Add a new
parameter vector_mask_p to control whether the comparison
result should be a vector or not.
(ix86_expand_int_sse_cmp): Ditto.
(ix86_expand_sse_movcc): cmpmode should be MODE_INT.
(ix86_expand_fp_movcc): Allow vector comparison dest as
integer mask.
(ix86_expand_fp_vcond): Ditto.
(ix86_expand_int_vcond): Ditto.
(ix86_expand_fp_vec_cmp): Require vector comparison dest as
vector.
(ix86_expand_int_vec_cmp): Ditto.
(ix86_expand_sse_unpack): Ditto.
(ix86_expand_mul_widen_evenodd): Ditto.
gcc/testsuite/ChangeLog:
PR target/98537
* g++.target/i386/avx512bw-pr98537-1.C: New test.
* g++.target/i386/avx512vl-pr98537-1.C: New test.
* g++.target/i386/avx512vl-pr98537-2.C: New test.
---
gcc/config/i386/i386-expand.c | 63 ++++++++++---------
.../g++.target/i386/avx512bw-pr98537-1.C | 11 ++++
.../g++.target/i386/avx512vl-pr98537-1.C | 40 ++++++++++++
.../g++.target/i386/avx512vl-pr98537-2.C | 8 +++
4 files changed, 93 insertions(+), 29 deletions(-)
create mode 100644 gcc/testsuite/g++.target/i386/avx512bw-pr98537-1.C
create mode 100644 gcc/testsuite/g++.target/i386/avx512vl-pr98537-1.C
create mode 100644 gcc/testsuite/g++.target/i386/avx512vl-pr98537-2.C
diff --git a/gcc/config/i386/i386-expand.c b/gcc/config/i386/i386-expand.c
index 6e08fd32726..1e4ef3b9f3f 100644
--- a/gcc/config/i386/i386-expand.c
+++ b/gcc/config/i386/i386-expand.c
@@ -3469,11 +3469,12 @@ ix86_valid_mask_cmp_mode (machine_mode mode)
return vector_size == 64 || TARGET_AVX512VL;
}
-/* Expand an SSE comparison. Return the register with the result. */
+/* Expand an SSE comparison. Return the register with the result.
+ If vector_mask_p is true, result of comparison should be a vector mask. */
static rtx
ix86_expand_sse_cmp (rtx dest, enum rtx_code code, rtx cmp_op0, rtx cmp_op1,
- rtx op_true, rtx op_false)
+ rtx op_true, rtx op_false, bool vector_mask_p)
{
machine_mode mode = GET_MODE (dest);
machine_mode cmp_ops_mode = GET_MODE (cmp_op0);
@@ -3485,7 +3486,7 @@ ix86_expand_sse_cmp (rtx dest, enum rtx_code code, rtx cmp_op0, rtx cmp_op1,
bool maskcmp = false;
rtx x;
- if (ix86_valid_mask_cmp_mode (cmp_ops_mode))
+ if (!vector_mask_p && ix86_valid_mask_cmp_mode (cmp_ops_mode))
{
unsigned int nbits = GET_MODE_NUNITS (cmp_ops_mode);
maskcmp = true;
@@ -3517,7 +3518,7 @@ ix86_expand_sse_cmp (rtx dest, enum rtx_code code, rtx cmp_op0, rtx cmp_op1,
x = gen_rtx_fmt_ee (code, cmp_mode, cmp_op0, cmp_op1);
- if (cmp_mode != mode && !maskcmp)
+ if (cmp_mode != mode)
{
x = force_reg (cmp_ops_mode, x);
convert_move (dest, x, false);
@@ -3544,9 +3545,6 @@ ix86_expand_sse_movcc (rtx dest, rtx cmp, rtx op_true, rtx op_false)
return;
}
- /* In AVX512F the result of comparison is an integer mask. */
- bool maskcmp = mode != cmpmode && ix86_valid_mask_cmp_mode (mode);
-
rtx t2, t3, x;
/* If we have an integer mask and FP value then we need
@@ -3557,7 +3555,10 @@ ix86_expand_sse_movcc (rtx dest, rtx cmp, rtx op_true, rtx op_false)
cmp = gen_rtx_SUBREG (mode, cmp, 0);
}
- if (maskcmp)
+ /* In AVX512F the result of comparison is an integer mask. */
+ if (mode != cmpmode
+ && GET_MODE_CLASS (cmpmode) == MODE_INT
+ && ix86_valid_mask_cmp_mode (mode))
{
/* Using vector move with mask register. */
cmp = force_reg (cmpmode, cmp);
@@ -3846,7 +3847,7 @@ ix86_expand_fp_movcc (rtx operands[])
return true;
tmp = ix86_expand_sse_cmp (operands[0], code, op0, op1,
- operands[2], operands[3]);
+ operands[2], operands[3], false);
ix86_expand_sse_movcc (operands[0], tmp, operands[2], operands[3]);
return true;
}
@@ -4002,16 +4003,16 @@ ix86_expand_fp_vec_cmp (rtx operands[])
{
case LTGT:
temp = ix86_expand_sse_cmp (operands[0], ORDERED, operands[2],
- operands[3], NULL, NULL);
+ operands[3], NULL, NULL, true);
cmp = ix86_expand_sse_cmp (operands[0], NE, operands[2],
- operands[3], NULL, NULL);
+ operands[3], NULL, NULL, true);
code = AND;
break;
case UNEQ:
temp = ix86_expand_sse_cmp (operands[0], UNORDERED, operands[2],
- operands[3], NULL, NULL);
+ operands[3], NULL, NULL, true);
cmp = ix86_expand_sse_cmp (operands[0], EQ, operands[2],
- operands[3], NULL, NULL);
+ operands[3], NULL, NULL, true);
code = IOR;
break;
default:
@@ -4022,7 +4023,7 @@ ix86_expand_fp_vec_cmp (rtx operands[])
}
else
cmp = ix86_expand_sse_cmp (operands[0], code, operands[2], operands[3],
- operands[1], operands[2]);
+ operands[1], operands[2], true);
if (operands[0] != cmp)
emit_move_insn (operands[0], cmp);
@@ -4032,7 +4033,7 @@ ix86_expand_fp_vec_cmp (rtx operands[])
static rtx
ix86_expand_int_sse_cmp (rtx dest, enum rtx_code code, rtx cop0, rtx cop1,
- rtx op_true, rtx op_false, bool *negate)
+ rtx op_true, rtx op_false, bool *negate, bool vector_mask_p)
{
machine_mode data_mode = GET_MODE (dest);
machine_mode mode = GET_MODE (cop0);
@@ -4047,7 +4048,7 @@ ix86_expand_int_sse_cmp (rtx dest, enum rtx_code code, rtx cop0, rtx cop1,
;
/* AVX512F supports all of the comparsions
on all 128/256/512-bit vector int types. */
- else if (ix86_valid_mask_cmp_mode (mode))
+ else if (!vector_mask_p && ix86_valid_mask_cmp_mode (mode))
;
else
{
@@ -4266,13 +4267,13 @@ ix86_expand_int_sse_cmp (rtx dest, enum rtx_code code, rtx cop0, rtx cop1,
if (data_mode == mode)
{
x = ix86_expand_sse_cmp (dest, code, cop0, cop1,
- op_true, op_false);
+ op_true, op_false, vector_mask_p);
}
else
{
gcc_assert (GET_MODE_SIZE (data_mode) == GET_MODE_SIZE (mode));
x = ix86_expand_sse_cmp (gen_reg_rtx (mode), code, cop0, cop1,
- op_true, op_false);
+ op_true, op_false, vector_mask_p);
if (GET_MODE (x) == mode)
x = gen_lowpart (data_mode, x);
}
@@ -4288,7 +4289,7 @@ ix86_expand_int_vec_cmp (rtx operands[])
rtx_code code = GET_CODE (operands[1]);
bool negate = false;
rtx cmp = ix86_expand_int_sse_cmp (operands[0], code, operands[2],
- operands[3], NULL, NULL, &negate);
+ operands[3], NULL, NULL, &negate, true);
if (!cmp)
return false;
@@ -4296,7 +4297,7 @@ ix86_expand_int_vec_cmp (rtx operands[])
if (negate)
cmp = ix86_expand_int_sse_cmp (operands[0], EQ, cmp,
CONST0_RTX (GET_MODE (cmp)),
- NULL, NULL, &negate);
+ NULL, NULL, &negate, true);
gcc_assert (!negate);
@@ -4324,16 +4325,20 @@ ix86_expand_fp_vcond (rtx operands[])
{
case LTGT:
temp = ix86_expand_sse_cmp (operands[0], ORDERED, operands[4],
- operands[5], operands[0], operands[0]);
+ operands[5], operands[0], operands[0],
+ false);
cmp = ix86_expand_sse_cmp (operands[0], NE, operands[4],
- operands[5], operands[1], operands[2]);
+ operands[5], operands[1], operands[2],
+ false);
code = AND;
break;
case UNEQ:
temp = ix86_expand_sse_cmp (operands[0], UNORDERED, operands[4],
- operands[5], operands[0], operands[0]);
+ operands[5], operands[0], operands[0],
+ false);
cmp = ix86_expand_sse_cmp (operands[0], EQ, operands[4],
- operands[5], operands[1], operands[2]);
+ operands[5], operands[1], operands[2],
+ false);
code = IOR;
break;
default:
@@ -4350,7 +4355,7 @@ ix86_expand_fp_vcond (rtx operands[])
return true;
cmp = ix86_expand_sse_cmp (operands[0], code, operands[4], operands[5],
- operands[1], operands[2]);
+ operands[1], operands[2], false);
ix86_expand_sse_movcc (operands[0], cmp, operands[1], operands[2]);
return true;
}
@@ -4409,7 +4414,7 @@ ix86_expand_int_vcond (rtx operands[])
operands[2] = force_reg (data_mode, operands[2]);
x = ix86_expand_int_sse_cmp (operands[0], code, cop0, cop1,
- operands[1], operands[2], &negate);
+ operands[1], operands[2], &negate, false);
if (!x)
return false;
@@ -5076,7 +5081,7 @@ ix86_expand_sse_unpack (rtx dest, rtx src, bool unsigned_p, bool high_p)
tmp = force_reg (imode, CONST0_RTX (imode));
else
tmp = ix86_expand_sse_cmp (gen_reg_rtx (imode), GT, CONST0_RTX (imode),
- src, pc_rtx, pc_rtx);
+ src, pc_rtx, pc_rtx, true);
rtx tmp2 = gen_reg_rtx (imode);
emit_insn (unpack (tmp2, src, tmp));
@@ -20374,9 +20379,9 @@ ix86_expand_mul_widen_evenodd (rtx dest, rtx op1, rtx op2,
/* Compute the sign-extension, aka highparts, of the two operands. */
s1 = ix86_expand_sse_cmp (gen_reg_rtx (mode), GT, CONST0_RTX (mode),
- op1, pc_rtx, pc_rtx);
+ op1, pc_rtx, pc_rtx, true);
s2 = ix86_expand_sse_cmp (gen_reg_rtx (mode), GT, CONST0_RTX (mode),
- op2, pc_rtx, pc_rtx);
+ op2, pc_rtx, pc_rtx, true);
/* Multiply LO(A) * HI(B), and vice-versa. */
t1 = gen_reg_rtx (wmode);
diff --git a/gcc/testsuite/g++.target/i386/avx512bw-pr98537-1.C b/gcc/testsuite/g++.target/i386/avx512bw-pr98537-1.C
new file mode 100644
index 00000000000..969684a222b
--- /dev/null
+++ b/gcc/testsuite/g++.target/i386/avx512bw-pr98537-1.C
@@ -0,0 +1,11 @@
+/* PR target/98537 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=x86-64 -std=c++11" } */
+
+#define TYPEV char
+#define TYPEW short
+
+#define T_ARR \
+ __attribute__ ((target ("avx512vl,avx512bw")))
+
+#include "avx512vl-pr98537-1.C"
diff --git a/gcc/testsuite/g++.target/i386/avx512vl-pr98537-1.C b/gcc/testsuite/g++.target/i386/avx512vl-pr98537-1.C
new file mode 100644
index 00000000000..b2ba91111da
--- /dev/null
+++ b/gcc/testsuite/g++.target/i386/avx512vl-pr98537-1.C
@@ -0,0 +1,40 @@
+/* PR target/98537 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=x86-64 -std=c++11" } */
+
+#ifndef TYPEV
+#define TYPEV int
+#endif
+
+#ifndef TYPEW
+#define TYPEW long long
+#endif
+
+#ifndef T_ARR
+#define T_ARR \
+ __attribute__ ((target ("avx512vl")))
+#endif
+
+typedef TYPEV V __attribute__((__vector_size__(32)));
+typedef TYPEW W __attribute__((__vector_size__(32)));
+
+W c, d;
+struct B {};
+B e;
+struct C { W i; };
+void foo (C);
+
+C
+operator== (B, B)
+{
+ W r = (V)c == (V)d;
+ return {r};
+}
+
+void
+T_ARR
+bar ()
+{
+ B a;
+ foo (a == e);
+}
diff --git a/gcc/testsuite/g++.target/i386/avx512vl-pr98537-2.C b/gcc/testsuite/g++.target/i386/avx512vl-pr98537-2.C
new file mode 100644
index 00000000000..42c9682746d
--- /dev/null
+++ b/gcc/testsuite/g++.target/i386/avx512vl-pr98537-2.C
@@ -0,0 +1,8 @@
+/* PR target/98537 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=x86-64 -std=c++11" } */
+
+#define TYPEV float
+#define TYPEW double
+
+#include "avx512vl-pr98537-1.C"
--
2.18.1
next prev parent reply other threads:[~2021-01-07 5:19 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-01-06 6:49 Hongtao Liu
2021-01-06 14:39 ` Jakub Jelinek
2021-01-07 5:22 ` Hongtao Liu [this message]
2021-01-14 11:16 ` Hongtao Liu
2021-01-26 5:30 ` Hongtao Liu
2021-02-04 5:31 ` Hongtao Liu
2021-02-04 12:00 ` Jakub Jelinek
2021-02-05 1:52 ` Hongtao Liu
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to='CAMZc-bxQAL8RPp14nPCVEo+9TGmNUxf42LFDRwFsri2YX2O=9A@mail.gmail.com' \
--to=crazylht@gmail.com \
--cc=gcc-patches@gcc.gnu.org \
--cc=hjl.tools@gmail.com \
--cc=jakub@redhat.com \
--cc=kirill.yukhin@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).