public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Hongtao Liu <crazylht@gmail.com>
To: Jakub Jelinek <jakub@redhat.com>
Cc: GCC Patches <gcc-patches@gcc.gnu.org>,
	Kirill Yukhin <kirill.yukhin@gmail.com>,
	"H. J. Lu" <hjl.tools@gmail.com>
Subject: Re: [PATCH] [AVX512] Fix ICE: Convert integer mask to vector in ix86_expand_fp_vec_cmp/ix86_expand_int_vec_cmp [PR98537]
Date: Thu, 7 Jan 2021 13:22:00 +0800	[thread overview]
Message-ID: <CAMZc-bxQAL8RPp14nPCVEo+9TGmNUxf42LFDRwFsri2YX2O=9A@mail.gmail.com> (raw)
In-Reply-To: <20210106143931.GC725145@tucnak>

[-- Attachment #1: Type: text/plain, Size: 3017 bytes --]

On Wed, Jan 6, 2021 at 10:39 PM Jakub Jelinek <jakub@redhat.com> wrote:
>
> On Wed, Jan 06, 2021 at 02:49:13PM +0800, Hongtao Liu wrote:
> >   ix86_expand_fp_vec_cmp/ix86_expand_int_vec_cmp are used by vec_cmpmn
> > for vector comparison to vector mask, but ix86_expand_sse_cmp(which is
> > called in upper 2 functions.) may return integer mask whenever integer
> > mask is available, so convert integer mask back to vector mask if
> > needed.
> >
> > gcc/ChangeLog:
> >
> >         PR target/98537
> >         * config/i386/i386-expand.c (ix86_expand_fp_vec_cmp):
> >         When cmp is integer mask, convert it to vector.
> >         (ix86_expand_int_vec_cmp): Ditto.
> >
> > gcc/testsuite/ChangeLog:
> >
> >         PR target/98537
> >         * g++.target/i386/avx512bw-pr98537-1.C: New test.
> >         * g++.target/i386/avx512vl-pr98537-1.C: New test.
> >         * g++.target/i386/avx512vl-pr98537-2.C: New test.
>
> Do we optimize it then to an AVX/AVX2 comparison if possible?
>
When i'm looking at the code, i find there's other places which
require comparison dest to be vector(i.e. ix86_expand_sse_unpack,
ix86_expand_mul_widen_evenodd). It's a potential bug.
So I fix this bug in another way which won't generate an integer mask
when the comparison dest is required to a vector mask.

Update patch:
  ix86_expand_sse_cmp/ix86_expand_int_sse_cmp is used for vector
comparison, considering that avx512 introduces integer mask, but some
original callers require the dest of comparison to be a vector.
So add a new parameter vector_mask_p to control the result
of vector comparison to be vector or not.
  regtested/bootstrapped on x86_64-linux-gnu{-m32,}.

gcc/ChangeLog:

        PR target/98537
        * config/i386/i386-expand.c (ix86_expand_sse_cmp): Add a new
        parameter vector_mask_p to control whether the comparison
        result should be a vector or not.
        (ix86_expand_int_sse_cmp): Ditto.
        (ix86_expand_sse_movcc): cmpmode should be MODE_INT.
        (ix86_expand_fp_movcc): Allow vector comparison dest as
        integer mask.
        (ix86_expand_fp_vcond): Ditto.
        (ix86_expand_int_vcond): Ditto.
        (ix86_expand_fp_vec_cmp): Require vector comparison dest as
        vector.
        (ix86_expand_int_vec_cmp): Ditto.
        (ix86_expand_sse_unpack): Ditto.
        (ix86_expand_mul_widen_evenodd): Ditto.

gcc/testsuite/ChangeLog:

        PR target/98537
        * g++.target/i386/avx512bw-pr98537-1.C: New test.
        * g++.target/i386/avx512vl-pr98537-1.C: New test.
        * g++.target/i386/avx512vl-pr98537-2.C: New test.


> @@ -4024,8 +4025,18 @@ ix86_expand_fp_vec_cmp (rtx operands[])
>      cmp = ix86_expand_sse_cmp (operands[0], code, operands[2], operands[3],
>                                operands[1], operands[2]);
>
> -  if (operands[0] != cmp)
> -    emit_move_insn (operands[0], cmp);
> +    if (operands[0] != cmp)
> +    {
>
> The indentation of the if above looks wrong.
> Otherwise LGTM.
>
>         Jakub
>


-- 
BR,
Hongtao

[-- Attachment #2: 0001-Fix-ICE-Convert-integer-mask-to-vector-in-ix86_expan_V2.patch --]
[-- Type: text/x-patch, Size: 11539 bytes --]

From bae4500e17f7590a45504c8c9e3ab0fe6200681d Mon Sep 17 00:00:00 2001
From: liuhongt <hongtao.liu@intel.com>
Date: Thu, 7 Jan 2021 10:15:33 +0800
Subject: [PATCH] Fix ICE: Convert integer mask to vector in
 ix86_expand_fp_vec_cmp/ix86_expand_int_vec_cmp [PR98537]

ix86_expand_sse_cmp/ix86_expand_int_sse_cmp is used for vector
comparison, considering that avx512 introduces integer mask, but some
original callers require the dest of comparison to be a vector.
So add a new parameter vector_mask_p to control the result
of vector comparison to be vector or not.

gcc/ChangeLog:

	PR target/98537
	* config/i386/i386-expand.c (ix86_expand_sse_cmp): Add a new
	parameter vector_mask_p to control whether the comparison
	result should be a vector or not.
	(ix86_expand_int_sse_cmp): Ditto.
	(ix86_expand_sse_movcc): cmpmode should be MODE_INT.
	(ix86_expand_fp_movcc): Allow vector comparison dest as
	integer mask.
	(ix86_expand_fp_vcond): Ditto.
	(ix86_expand_int_vcond): Ditto.
	(ix86_expand_fp_vec_cmp): Require vector comparison dest as
	vector.
	(ix86_expand_int_vec_cmp): Ditto.
	(ix86_expand_sse_unpack): Ditto.
	(ix86_expand_mul_widen_evenodd): Ditto.

gcc/testsuite/ChangeLog:

	PR target/98537
	* g++.target/i386/avx512bw-pr98537-1.C: New test.
	* g++.target/i386/avx512vl-pr98537-1.C: New test.
	* g++.target/i386/avx512vl-pr98537-2.C: New test.
---
 gcc/config/i386/i386-expand.c                 | 63 ++++++++++---------
 .../g++.target/i386/avx512bw-pr98537-1.C      | 11 ++++
 .../g++.target/i386/avx512vl-pr98537-1.C      | 40 ++++++++++++
 .../g++.target/i386/avx512vl-pr98537-2.C      |  8 +++
 4 files changed, 93 insertions(+), 29 deletions(-)
 create mode 100644 gcc/testsuite/g++.target/i386/avx512bw-pr98537-1.C
 create mode 100644 gcc/testsuite/g++.target/i386/avx512vl-pr98537-1.C
 create mode 100644 gcc/testsuite/g++.target/i386/avx512vl-pr98537-2.C

diff --git a/gcc/config/i386/i386-expand.c b/gcc/config/i386/i386-expand.c
index 6e08fd32726..1e4ef3b9f3f 100644
--- a/gcc/config/i386/i386-expand.c
+++ b/gcc/config/i386/i386-expand.c
@@ -3469,11 +3469,12 @@ ix86_valid_mask_cmp_mode (machine_mode mode)
   return vector_size == 64 || TARGET_AVX512VL;
 }
 
-/* Expand an SSE comparison.  Return the register with the result.  */
+/* Expand an SSE comparison.  Return the register with the result.
+   If vector_mask_p is true, result of comparison should be a vector mask.  */
 
 static rtx
 ix86_expand_sse_cmp (rtx dest, enum rtx_code code, rtx cmp_op0, rtx cmp_op1,
-		     rtx op_true, rtx op_false)
+		     rtx op_true, rtx op_false, bool vector_mask_p)
 {
   machine_mode mode = GET_MODE (dest);
   machine_mode cmp_ops_mode = GET_MODE (cmp_op0);
@@ -3485,7 +3486,7 @@ ix86_expand_sse_cmp (rtx dest, enum rtx_code code, rtx cmp_op0, rtx cmp_op1,
   bool maskcmp = false;
   rtx x;
 
-  if (ix86_valid_mask_cmp_mode (cmp_ops_mode))
+  if (!vector_mask_p && ix86_valid_mask_cmp_mode (cmp_ops_mode))
     {
       unsigned int nbits = GET_MODE_NUNITS (cmp_ops_mode);
       maskcmp = true;
@@ -3517,7 +3518,7 @@ ix86_expand_sse_cmp (rtx dest, enum rtx_code code, rtx cmp_op0, rtx cmp_op1,
 
   x = gen_rtx_fmt_ee (code, cmp_mode, cmp_op0, cmp_op1);
 
-  if (cmp_mode != mode && !maskcmp)
+  if (cmp_mode != mode)
     {
       x = force_reg (cmp_ops_mode, x);
       convert_move (dest, x, false);
@@ -3544,9 +3545,6 @@ ix86_expand_sse_movcc (rtx dest, rtx cmp, rtx op_true, rtx op_false)
       return;
     }
 
-  /* In AVX512F the result of comparison is an integer mask.  */
-  bool maskcmp = mode != cmpmode && ix86_valid_mask_cmp_mode (mode);
-
   rtx t2, t3, x;
 
   /* If we have an integer mask and FP value then we need
@@ -3557,7 +3555,10 @@ ix86_expand_sse_movcc (rtx dest, rtx cmp, rtx op_true, rtx op_false)
       cmp = gen_rtx_SUBREG (mode, cmp, 0);
     }
 
-  if (maskcmp)
+  /* In AVX512F the result of comparison is an integer mask.  */
+  if (mode != cmpmode
+      && GET_MODE_CLASS (cmpmode) == MODE_INT
+      && ix86_valid_mask_cmp_mode (mode))
     {
       /* Using vector move with mask register.  */
       cmp = force_reg (cmpmode, cmp);
@@ -3846,7 +3847,7 @@ ix86_expand_fp_movcc (rtx operands[])
 	return true;
 
       tmp = ix86_expand_sse_cmp (operands[0], code, op0, op1,
-				 operands[2], operands[3]);
+				 operands[2], operands[3], false);
       ix86_expand_sse_movcc (operands[0], tmp, operands[2], operands[3]);
       return true;
     }
@@ -4002,16 +4003,16 @@ ix86_expand_fp_vec_cmp (rtx operands[])
 	{
 	case LTGT:
 	  temp = ix86_expand_sse_cmp (operands[0], ORDERED, operands[2],
-				      operands[3], NULL, NULL);
+				      operands[3], NULL, NULL, true);
 	  cmp = ix86_expand_sse_cmp (operands[0], NE, operands[2],
-				     operands[3], NULL, NULL);
+				     operands[3], NULL, NULL, true);
 	  code = AND;
 	  break;
 	case UNEQ:
 	  temp = ix86_expand_sse_cmp (operands[0], UNORDERED, operands[2],
-				      operands[3], NULL, NULL);
+				      operands[3], NULL, NULL, true);
 	  cmp = ix86_expand_sse_cmp (operands[0], EQ, operands[2],
-				     operands[3], NULL, NULL);
+				     operands[3], NULL, NULL, true);
 	  code = IOR;
 	  break;
 	default:
@@ -4022,7 +4023,7 @@ ix86_expand_fp_vec_cmp (rtx operands[])
     }
   else
     cmp = ix86_expand_sse_cmp (operands[0], code, operands[2], operands[3],
-			       operands[1], operands[2]);
+			       operands[1], operands[2], true);
 
   if (operands[0] != cmp)
     emit_move_insn (operands[0], cmp);
@@ -4032,7 +4033,7 @@ ix86_expand_fp_vec_cmp (rtx operands[])
 
 static rtx
 ix86_expand_int_sse_cmp (rtx dest, enum rtx_code code, rtx cop0, rtx cop1,
-			 rtx op_true, rtx op_false, bool *negate)
+			 rtx op_true, rtx op_false, bool *negate, bool vector_mask_p)
 {
   machine_mode data_mode = GET_MODE (dest);
   machine_mode mode = GET_MODE (cop0);
@@ -4047,7 +4048,7 @@ ix86_expand_int_sse_cmp (rtx dest, enum rtx_code code, rtx cop0, rtx cop1,
     ;
   /* AVX512F supports all of the comparsions
      on all 128/256/512-bit vector int types.  */
-  else if (ix86_valid_mask_cmp_mode (mode))
+  else if (!vector_mask_p && ix86_valid_mask_cmp_mode (mode))
     ;
   else
     {
@@ -4266,13 +4267,13 @@ ix86_expand_int_sse_cmp (rtx dest, enum rtx_code code, rtx cop0, rtx cop1,
   if (data_mode == mode)
     {
       x = ix86_expand_sse_cmp (dest, code, cop0, cop1,
-			       op_true, op_false);
+			       op_true, op_false, vector_mask_p);
     }
   else
     {
       gcc_assert (GET_MODE_SIZE (data_mode) == GET_MODE_SIZE (mode));
       x = ix86_expand_sse_cmp (gen_reg_rtx (mode), code, cop0, cop1,
-			       op_true, op_false);
+			       op_true, op_false, vector_mask_p);
       if (GET_MODE (x) == mode)
 	x = gen_lowpart (data_mode, x);
     }
@@ -4288,7 +4289,7 @@ ix86_expand_int_vec_cmp (rtx operands[])
   rtx_code code = GET_CODE (operands[1]);
   bool negate = false;
   rtx cmp = ix86_expand_int_sse_cmp (operands[0], code, operands[2],
-				     operands[3], NULL, NULL, &negate);
+				     operands[3], NULL, NULL, &negate, true);
 
   if (!cmp)
     return false;
@@ -4296,7 +4297,7 @@ ix86_expand_int_vec_cmp (rtx operands[])
   if (negate)
     cmp = ix86_expand_int_sse_cmp (operands[0], EQ, cmp,
 				   CONST0_RTX (GET_MODE (cmp)),
-				   NULL, NULL, &negate);
+				   NULL, NULL, &negate, true);
 
   gcc_assert (!negate);
 
@@ -4324,16 +4325,20 @@ ix86_expand_fp_vcond (rtx operands[])
 	{
 	case LTGT:
 	  temp = ix86_expand_sse_cmp (operands[0], ORDERED, operands[4],
-				      operands[5], operands[0], operands[0]);
+				      operands[5], operands[0], operands[0],
+				      false);
 	  cmp = ix86_expand_sse_cmp (operands[0], NE, operands[4],
-				     operands[5], operands[1], operands[2]);
+				     operands[5], operands[1], operands[2],
+				     false);
 	  code = AND;
 	  break;
 	case UNEQ:
 	  temp = ix86_expand_sse_cmp (operands[0], UNORDERED, operands[4],
-				      operands[5], operands[0], operands[0]);
+				      operands[5], operands[0], operands[0],
+				      false);
 	  cmp = ix86_expand_sse_cmp (operands[0], EQ, operands[4],
-				     operands[5], operands[1], operands[2]);
+				     operands[5], operands[1], operands[2],
+				     false);
 	  code = IOR;
 	  break;
 	default:
@@ -4350,7 +4355,7 @@ ix86_expand_fp_vcond (rtx operands[])
     return true;
 
   cmp = ix86_expand_sse_cmp (operands[0], code, operands[4], operands[5],
-			     operands[1], operands[2]);
+			     operands[1], operands[2], false);
   ix86_expand_sse_movcc (operands[0], cmp, operands[1], operands[2]);
   return true;
 }
@@ -4409,7 +4414,7 @@ ix86_expand_int_vcond (rtx operands[])
     operands[2] = force_reg (data_mode, operands[2]);
 
   x = ix86_expand_int_sse_cmp (operands[0], code, cop0, cop1,
-			       operands[1], operands[2], &negate);
+			       operands[1], operands[2], &negate, false);
 
   if (!x)
     return false;
@@ -5076,7 +5081,7 @@ ix86_expand_sse_unpack (rtx dest, rtx src, bool unsigned_p, bool high_p)
 	tmp = force_reg (imode, CONST0_RTX (imode));
       else
 	tmp = ix86_expand_sse_cmp (gen_reg_rtx (imode), GT, CONST0_RTX (imode),
-				   src, pc_rtx, pc_rtx);
+				   src, pc_rtx, pc_rtx, true);
 
       rtx tmp2 = gen_reg_rtx (imode);
       emit_insn (unpack (tmp2, src, tmp));
@@ -20374,9 +20379,9 @@ ix86_expand_mul_widen_evenodd (rtx dest, rtx op1, rtx op2,
 
       /* Compute the sign-extension, aka highparts, of the two operands.  */
       s1 = ix86_expand_sse_cmp (gen_reg_rtx (mode), GT, CONST0_RTX (mode),
-				op1, pc_rtx, pc_rtx);
+				op1, pc_rtx, pc_rtx, true);
       s2 = ix86_expand_sse_cmp (gen_reg_rtx (mode), GT, CONST0_RTX (mode),
-				op2, pc_rtx, pc_rtx);
+				op2, pc_rtx, pc_rtx, true);
 
       /* Multiply LO(A) * HI(B), and vice-versa.  */
       t1 = gen_reg_rtx (wmode);
diff --git a/gcc/testsuite/g++.target/i386/avx512bw-pr98537-1.C b/gcc/testsuite/g++.target/i386/avx512bw-pr98537-1.C
new file mode 100644
index 00000000000..969684a222b
--- /dev/null
+++ b/gcc/testsuite/g++.target/i386/avx512bw-pr98537-1.C
@@ -0,0 +1,11 @@
+/* PR target/98537 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=x86-64 -std=c++11" } */
+
+#define TYPEV char
+#define TYPEW short
+
+#define T_ARR						\
+  __attribute__ ((target ("avx512vl,avx512bw")))
+
+#include "avx512vl-pr98537-1.C"
diff --git a/gcc/testsuite/g++.target/i386/avx512vl-pr98537-1.C b/gcc/testsuite/g++.target/i386/avx512vl-pr98537-1.C
new file mode 100644
index 00000000000..b2ba91111da
--- /dev/null
+++ b/gcc/testsuite/g++.target/i386/avx512vl-pr98537-1.C
@@ -0,0 +1,40 @@
+/* PR target/98537 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=x86-64 -std=c++11" } */
+
+#ifndef TYPEV
+#define TYPEV int
+#endif
+
+#ifndef TYPEW
+#define TYPEW long long
+#endif
+
+#ifndef T_ARR
+#define T_ARR					\
+  __attribute__ ((target ("avx512vl")))
+#endif
+
+typedef TYPEV V __attribute__((__vector_size__(32)));
+typedef TYPEW W __attribute__((__vector_size__(32)));
+
+W c, d;
+struct B {};
+B e;
+struct C { W i; };
+void foo (C);
+
+C
+operator== (B, B)
+{
+  W r = (V)c == (V)d;
+  return {r};
+}
+
+void
+T_ARR
+bar ()
+{
+  B a;
+  foo (a == e);
+}
diff --git a/gcc/testsuite/g++.target/i386/avx512vl-pr98537-2.C b/gcc/testsuite/g++.target/i386/avx512vl-pr98537-2.C
new file mode 100644
index 00000000000..42c9682746d
--- /dev/null
+++ b/gcc/testsuite/g++.target/i386/avx512vl-pr98537-2.C
@@ -0,0 +1,8 @@
+/* PR target/98537 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=x86-64 -std=c++11" } */
+
+#define TYPEV float
+#define TYPEW double
+
+#include "avx512vl-pr98537-1.C"
-- 
2.18.1


  reply	other threads:[~2021-01-07  5:19 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-01-06  6:49 Hongtao Liu
2021-01-06 14:39 ` Jakub Jelinek
2021-01-07  5:22   ` Hongtao Liu [this message]
2021-01-14 11:16     ` Hongtao Liu
2021-01-26  5:30       ` Hongtao Liu
2021-02-04  5:31         ` Hongtao Liu
2021-02-04 12:00           ` Jakub Jelinek
2021-02-05  1:52             ` Hongtao Liu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAMZc-bxQAL8RPp14nPCVEo+9TGmNUxf42LFDRwFsri2YX2O=9A@mail.gmail.com' \
    --to=crazylht@gmail.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=hjl.tools@gmail.com \
    --cc=jakub@redhat.com \
    --cc=kirill.yukhin@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).