Help with PR97872

public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed

* Help with PR97872
@ 2020-12-01 10:59 Prathamesh Kulkarni
  2020-12-01 11:09 ` Richard Biener
  0 siblings, 1 reply; 18+ messages in thread
From: Prathamesh Kulkarni @ 2020-12-01 10:59 UTC (permalink / raw)
  To: GCC Development, Richard Biener

Hi,
For the test mentioned in PR, I was trying to see if we could do
specialized expansion for vcond in target when operands are -1 and 0.
arm_expand_vcond gets the following operands:
(reg:V8QI 113 [ _2 ])
(reg:V8QI 117)
(reg:V8QI 118)
(lt (reg/v:V8QI 115 [ a ])
    (reg/v:V8QI 116 [ b ]))
(reg/v:V8QI 115 [ a ])
(reg/v:V8QI 116 [ b ])

where r117 and r118 are set to vector constants -1 and 0 respectively.
However, I am not sure if there's a way to check if the register is
constant during expansion time (since we don't have df analysis yet) ?

Alternatively, should we add a target hook that returns true if the
result of vector comparison is set to all-ones or all-zeros, and then
use this hook in gimple ISEL to effectively turn VEC_COND_EXPR into nop ?

Thanks,
Prathamesh

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Help with PR97872
  2020-12-01 10:59 Help with PR97872 Prathamesh Kulkarni
@ 2020-12-01 11:09 ` Richard Biener
  2020-12-03  9:06   ` Prathamesh Kulkarni
  0 siblings, 1 reply; 18+ messages in thread
From: Richard Biener @ 2020-12-01 11:09 UTC (permalink / raw)
  To: Prathamesh Kulkarni; +Cc: GCC Development

On Tue, 1 Dec 2020, Prathamesh Kulkarni wrote:

> Hi,
> For the test mentioned in PR, I was trying to see if we could do
> specialized expansion for vcond in target when operands are -1 and 0.
> arm_expand_vcond gets the following operands:
> (reg:V8QI 113 [ _2 ])
> (reg:V8QI 117)
> (reg:V8QI 118)
> (lt (reg/v:V8QI 115 [ a ])
>     (reg/v:V8QI 116 [ b ]))
> (reg/v:V8QI 115 [ a ])
> (reg/v:V8QI 116 [ b ])
> 
> where r117 and r118 are set to vector constants -1 and 0 respectively.
> However, I am not sure if there's a way to check if the register is
> constant during expansion time (since we don't have df analysis yet) ?
> 
> Alternatively, should we add a target hook that returns true if the
> result of vector comparison is set to all-ones or all-zeros, and then
> use this hook in gimple ISEL to effectively turn VEC_COND_EXPR into nop ?

Would everything match-up for a .VEC_CMP IFN producing a non-mask
vector type?  ISEL could special case the a ? -1 : 0 case this way.

> Thanks,
> Prathamesh
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Help with PR97872
  2020-12-01 11:09 ` Richard Biener
@ 2020-12-03  9:06   ` Prathamesh Kulkarni
  2020-12-03 11:05     ` Richard Biener
  0 siblings, 1 reply; 18+ messages in thread
From: Prathamesh Kulkarni @ 2020-12-03  9:06 UTC (permalink / raw)
  To: Richard Biener; +Cc: GCC Development

[-- Attachment #1: Type: text/plain, Size: 2140 bytes --]

On Tue, 1 Dec 2020 at 16:39, Richard Biener <rguenther@suse.de> wrote:
>
> On Tue, 1 Dec 2020, Prathamesh Kulkarni wrote:
>
> > Hi,
> > For the test mentioned in PR, I was trying to see if we could do
> > specialized expansion for vcond in target when operands are -1 and 0.
> > arm_expand_vcond gets the following operands:
> > (reg:V8QI 113 [ _2 ])
> > (reg:V8QI 117)
> > (reg:V8QI 118)
> > (lt (reg/v:V8QI 115 [ a ])
> >     (reg/v:V8QI 116 [ b ]))
> > (reg/v:V8QI 115 [ a ])
> > (reg/v:V8QI 116 [ b ])
> >
> > where r117 and r118 are set to vector constants -1 and 0 respectively.
> > However, I am not sure if there's a way to check if the register is
> > constant during expansion time (since we don't have df analysis yet) ?
> >
> > Alternatively, should we add a target hook that returns true if the
> > result of vector comparison is set to all-ones or all-zeros, and then
> > use this hook in gimple ISEL to effectively turn VEC_COND_EXPR into nop ?
>
> Would everything match-up for a .VEC_CMP IFN producing a non-mask
> vector type?  ISEL could special case the a ? -1 : 0 case this way.
I think the vec_cmp pattern matches but it produces a masked vector type.
In the attached patch, I simply replaced:
_1 = a < b
x = _1 ? -1 : 0
with
x = view_convert_expr<_1>

For the test-case, isel generates:
  vector(8) <signed-boolean:8> _1;
  vector(8) signed char _2;
  uint8x8_t _5;

  <bb 2> [local count: 1073741824]:
  _1 = a_3(D) < b_4(D);
  _2 = VIEW_CONVERT_EXPR<vector(8) signed char>(_1);
  _5 = VIEW_CONVERT_EXPR<uint8x8_t>(_2);
  return _5;

and results in desired code-gen:
f1:
        vcgt.s8 d0, d1, d0
        bx      lr

Altho I guess, we should remove the redundant conversions during isel itself ?
and result in:
_1 = a_3(D) < b_4(D)
_5 = VIEW_CONVERT_EXPR<uint8x8_t>(_1)

(Patch is lightly tested with only vect.exp)

Thanks,
Prathamesh
>
> > Thanks,
> > Prathamesh
> >
>
> --
> Richard Biener <rguenther@suse.de>
> SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)

[-- Attachment #2: pr97282-1.diff --]
[-- Type: application/octet-stream, Size: 1761 bytes --]

diff --git a/gcc/gimple-isel.cc b/gcc/gimple-isel.cc
index 048b407bd11..56cb9dfc310 100644
--- a/gcc/gimple-isel.cc
+++ b/gcc/gimple-isel.cc
@@ -135,6 +135,26 @@ gimple_expand_vec_cond_expr (gimple_stmt_iterator *gsi,
   lhs = gimple_assign_lhs (stmt);
   machine_mode mode = TYPE_MODE (TREE_TYPE (lhs));
 
+  /* For targets where result of comparison is all-ones or all-zeros,
+     a < b ? -1 : 0 can be reduced to a < b.  */
+
+  if (integer_minus_onep (op1) && integer_zerop (op2))
+    {
+      gassign *def_stmt = dyn_cast<gassign *> (SSA_NAME_DEF_STMT (op0));
+      tree op0a = gimple_assign_rhs1 (def_stmt);
+      tree op0_type = TREE_TYPE (op0);
+      tree op0a_type = TREE_TYPE (op0a);
+      enum tree_code tcode = gimple_assign_rhs_code (def_stmt);
+
+      if (expand_vec_cmp_expr_p (op0a_type, op0_type, tcode))
+	{
+	  tree conv_op = build1 (VIEW_CONVERT_EXPR, TREE_TYPE (lhs), op0);
+	  gassign *new_stmt = gimple_build_assign (lhs, conv_op);
+	  gsi_replace (gsi, new_stmt, true);
+	  return new_stmt;
+	}
+    }
+
   /* Lower mask typed, non-vector mode VEC_COND_EXPRs to bitwise operations.
      Those can end up generated by folding and at least for integer mode masks
      we cannot expect vcond expanders to exist.  We lower a ? b : c
diff --git a/gcc/testsuite/gcc.target/arm/pr97282.c b/gcc/testsuite/gcc.target/arm/pr97282.c
new file mode 100644
index 00000000000..eeb4dd9d6bc
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/pr97282.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-O3" } */
+/* { dg-add-options arm_neon } */
+
+#include <arm_neon.h>
+
+uint8x8_t f1(int8x8_t a, int8x8_t b) {
+  return a < b;
+}
+
+/* { dg-final { scan-assembler-not "vbsl" } } */

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Help with PR97872
  2020-12-03  9:06   ` Prathamesh Kulkarni
@ 2020-12-03 11:05     ` Richard Biener
  2020-12-04  9:22       ` Prathamesh Kulkarni
  0 siblings, 1 reply; 18+ messages in thread
From: Richard Biener @ 2020-12-03 11:05 UTC (permalink / raw)
  To: Prathamesh Kulkarni; +Cc: GCC Development

On Thu, 3 Dec 2020, Prathamesh Kulkarni wrote:

> On Tue, 1 Dec 2020 at 16:39, Richard Biener <rguenther@suse.de> wrote:
> >
> > On Tue, 1 Dec 2020, Prathamesh Kulkarni wrote:
> >
> > > Hi,
> > > For the test mentioned in PR, I was trying to see if we could do
> > > specialized expansion for vcond in target when operands are -1 and 0.
> > > arm_expand_vcond gets the following operands:
> > > (reg:V8QI 113 [ _2 ])
> > > (reg:V8QI 117)
> > > (reg:V8QI 118)
> > > (lt (reg/v:V8QI 115 [ a ])
> > >     (reg/v:V8QI 116 [ b ]))
> > > (reg/v:V8QI 115 [ a ])
> > > (reg/v:V8QI 116 [ b ])
> > >
> > > where r117 and r118 are set to vector constants -1 and 0 respectively.
> > > However, I am not sure if there's a way to check if the register is
> > > constant during expansion time (since we don't have df analysis yet) ?
> > >
> > > Alternatively, should we add a target hook that returns true if the
> > > result of vector comparison is set to all-ones or all-zeros, and then
> > > use this hook in gimple ISEL to effectively turn VEC_COND_EXPR into nop ?
> >
> > Would everything match-up for a .VEC_CMP IFN producing a non-mask
> > vector type?  ISEL could special case the a ? -1 : 0 case this way.
> I think the vec_cmp pattern matches but it produces a masked vector type.
> In the attached patch, I simply replaced:
> _1 = a < b
> x = _1 ? -1 : 0
> with
> x = view_convert_expr<_1>
> 
> For the test-case, isel generates:
>   vector(8) <signed-boolean:8> _1;
>   vector(8) signed char _2;
>   uint8x8_t _5;
> 
>   <bb 2> [local count: 1073741824]:
>   _1 = a_3(D) < b_4(D);
>   _2 = VIEW_CONVERT_EXPR<vector(8) signed char>(_1);
>   _5 = VIEW_CONVERT_EXPR<uint8x8_t>(_2);
>   return _5;
> 
> and results in desired code-gen:
> f1:
>         vcgt.s8 d0, d1, d0
>         bx      lr
> 
> Altho I guess, we should remove the redundant conversions during isel itself ?
> and result in:
> _1 = a_3(D) < b_4(D)
> _5 = VIEW_CONVERT_EXPR<uint8x8_t>(_1)
> 
> (Patch is lightly tested with only vect.exp)

+  /* For targets where result of comparison is all-ones or all-zeros,
+     a < b ? -1 : 0 can be reduced to a < b.  */
+
+  if (integer_minus_onep (op1) && integer_zerop (op2))
+    {

So this really belongs here:

          tree op0_type = TREE_TYPE (op0);
          tree op0a_type = TREE_TYPE (op0a);

<---

          if (used_vec_cond_exprs >= 2
              && (get_vcond_mask_icode (mode, TYPE_MODE (op0_type))
                  != CODE_FOR_nothing)
              && expand_vec_cmp_expr_p (op0a_type, op0_type, tcode))
            {


+      gassign *def_stmt = dyn_cast<gassign *> (SSA_NAME_DEF_STMT (op0));
+      tree op0a = gimple_assign_rhs1 (def_stmt);
+      tree op0_type = TREE_TYPE (op0);
+      tree op0a_type = TREE_TYPE (op0a);
+      enum tree_code tcode = gimple_assign_rhs_code (def_stmt);
+
+      if (expand_vec_cmp_expr_p (op0a_type, op0_type, tcode))
+       {
+         tree conv_op = build1 (VIEW_CONVERT_EXPR, TREE_TYPE (lhs), op0);
+         gassign *new_stmt = gimple_build_assign (lhs, conv_op);
+         gsi_replace (gsi, new_stmt, true);

and you need to verify that the mode of the lhs and the mode of op0
agree and that the target can actually expand_vec_cmp_expr_p

Richard.



> Thanks,
> Prathamesh
> >
> > > Thanks,
> > > Prathamesh
> > >
> >
> > --
> > Richard Biener <rguenther@suse.de>
> > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> > Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Help with PR97872
  2020-12-03 11:05     ` Richard Biener
@ 2020-12-04  9:22       ` Prathamesh Kulkarni
  2020-12-04 11:48         ` Richard Biener
  0 siblings, 1 reply; 18+ messages in thread
From: Prathamesh Kulkarni @ 2020-12-04  9:22 UTC (permalink / raw)
  To: Richard Biener; +Cc: GCC Development

[-- Attachment #1: Type: text/plain, Size: 4323 bytes --]

On Thu, 3 Dec 2020 at 16:35, Richard Biener <rguenther@suse.de> wrote:
>
> On Thu, 3 Dec 2020, Prathamesh Kulkarni wrote:
>
> > On Tue, 1 Dec 2020 at 16:39, Richard Biener <rguenther@suse.de> wrote:
> > >
> > > On Tue, 1 Dec 2020, Prathamesh Kulkarni wrote:
> > >
> > > > Hi,
> > > > For the test mentioned in PR, I was trying to see if we could do
> > > > specialized expansion for vcond in target when operands are -1 and 0.
> > > > arm_expand_vcond gets the following operands:
> > > > (reg:V8QI 113 [ _2 ])
> > > > (reg:V8QI 117)
> > > > (reg:V8QI 118)
> > > > (lt (reg/v:V8QI 115 [ a ])
> > > >     (reg/v:V8QI 116 [ b ]))
> > > > (reg/v:V8QI 115 [ a ])
> > > > (reg/v:V8QI 116 [ b ])
> > > >
> > > > where r117 and r118 are set to vector constants -1 and 0 respectively.
> > > > However, I am not sure if there's a way to check if the register is
> > > > constant during expansion time (since we don't have df analysis yet) ?
> > > >
> > > > Alternatively, should we add a target hook that returns true if the
> > > > result of vector comparison is set to all-ones or all-zeros, and then
> > > > use this hook in gimple ISEL to effectively turn VEC_COND_EXPR into nop ?
> > >
> > > Would everything match-up for a .VEC_CMP IFN producing a non-mask
> > > vector type?  ISEL could special case the a ? -1 : 0 case this way.
> > I think the vec_cmp pattern matches but it produces a masked vector type.
> > In the attached patch, I simply replaced:
> > _1 = a < b
> > x = _1 ? -1 : 0
> > with
> > x = view_convert_expr<_1>
> >
> > For the test-case, isel generates:
> >   vector(8) <signed-boolean:8> _1;
> >   vector(8) signed char _2;
> >   uint8x8_t _5;
> >
> >   <bb 2> [local count: 1073741824]:
> >   _1 = a_3(D) < b_4(D);
> >   _2 = VIEW_CONVERT_EXPR<vector(8) signed char>(_1);
> >   _5 = VIEW_CONVERT_EXPR<uint8x8_t>(_2);
> >   return _5;
> >
> > and results in desired code-gen:
> > f1:
> >         vcgt.s8 d0, d1, d0
> >         bx      lr
> >
> > Altho I guess, we should remove the redundant conversions during isel itself ?
> > and result in:
> > _1 = a_3(D) < b_4(D)
> > _5 = VIEW_CONVERT_EXPR<uint8x8_t>(_1)
> >
> > (Patch is lightly tested with only vect.exp)
>
> +  /* For targets where result of comparison is all-ones or all-zeros,
> +     a < b ? -1 : 0 can be reduced to a < b.  */
> +
> +  if (integer_minus_onep (op1) && integer_zerop (op2))
> +    {
>
> So this really belongs here:
>
>           tree op0_type = TREE_TYPE (op0);
>           tree op0a_type = TREE_TYPE (op0a);
>
> <---
>
>           if (used_vec_cond_exprs >= 2
>               && (get_vcond_mask_icode (mode, TYPE_MODE (op0_type))
>                   != CODE_FOR_nothing)
>               && expand_vec_cmp_expr_p (op0a_type, op0_type, tcode))
>             {
>
>
> +      gassign *def_stmt = dyn_cast<gassign *> (SSA_NAME_DEF_STMT (op0));
> +      tree op0a = gimple_assign_rhs1 (def_stmt);
> +      tree op0_type = TREE_TYPE (op0);
> +      tree op0a_type = TREE_TYPE (op0a);
> +      enum tree_code tcode = gimple_assign_rhs_code (def_stmt);
> +
> +      if (expand_vec_cmp_expr_p (op0a_type, op0_type, tcode))
> +       {
> +         tree conv_op = build1 (VIEW_CONVERT_EXPR, TREE_TYPE (lhs), op0);
> +         gassign *new_stmt = gimple_build_assign (lhs, conv_op);
> +         gsi_replace (gsi, new_stmt, true);
>
> and you need to verify that the mode of the lhs and the mode of op0
> agree and that the target can actually expand_vec_cmp_expr_p
Thanks for the suggestions, does the attached patch look OK ?
Sorry, I am not sure how to check if target can actually expand vec_cmp ?
I assume that since expand_vec_cmp_expr_p queries optab and if it gets
a valid cmp icode, that
should be sufficient ?

Thanks,
Prathamesh
>
> Richard.
>
>
>
> > Thanks,
> > Prathamesh
> > >
> > > > Thanks,
> > > > Prathamesh
> > > >
> > >
> > > --
> > > Richard Biener <rguenther@suse.de>
> > > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> > > Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
> >
>
> --
> Richard Biener <rguenther@suse.de>
> SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)

[-- Attachment #2: pr97282-2.diff --]
[-- Type: application/octet-stream, Size: 1463 bytes --]

diff --git a/gcc/gimple-isel.cc b/gcc/gimple-isel.cc
index 048b407bd11..1f9a17105ea 100644
--- a/gcc/gimple-isel.cc
+++ b/gcc/gimple-isel.cc
@@ -184,6 +184,21 @@ gimple_expand_vec_cond_expr (gimple_stmt_iterator *gsi,
 
 	  tree op0_type = TREE_TYPE (op0);
 	  tree op0a_type = TREE_TYPE (op0a);
+
+	  /* For targets where result of comparison is all-ones or all-zeros,
+	     a < b ? -1 : 0 can be reduced to a < b.  */
+
+	  if (integer_minus_onep (op1)
+	      && integer_zerop (op2)
+	      && TYPE_MODE (TREE_TYPE (lhs)) == TYPE_MODE (TREE_TYPE (op0))
+	      && expand_vec_cmp_expr_p (op0a_type, op0_type, tcode))
+	    {
+	      tree conv_op = build1 (VIEW_CONVERT_EXPR, TREE_TYPE (lhs), op0);
+	      gassign *new_stmt = gimple_build_assign (lhs, conv_op);
+	      gsi_replace (gsi, new_stmt, true);
+	      return new_stmt;
+	    }
+
 	  if (used_vec_cond_exprs >= 2
 	      && (get_vcond_mask_icode (mode, TYPE_MODE (op0_type))
 		  != CODE_FOR_nothing)
diff --git a/gcc/testsuite/gcc.target/arm/pr97872.c b/gcc/testsuite/gcc.target/arm/pr97872.c
new file mode 100644
index 00000000000..eeb4dd9d6bc
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/pr97872.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-O3" } */
+/* { dg-add-options arm_neon } */
+
+#include <arm_neon.h>
+
+uint8x8_t f1(int8x8_t a, int8x8_t b) {
+  return a < b;
+}
+
+/* { dg-final { scan-assembler-not "vbsl" } } */

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Help with PR97872
  2020-12-04  9:22       ` Prathamesh Kulkarni
@ 2020-12-04 11:48         ` Richard Biener
  2020-12-07  5:53           ` Prathamesh Kulkarni
  0 siblings, 1 reply; 18+ messages in thread
From: Richard Biener @ 2020-12-04 11:48 UTC (permalink / raw)
  To: Prathamesh Kulkarni; +Cc: GCC Development

On Fri, 4 Dec 2020, Prathamesh Kulkarni wrote:

> On Thu, 3 Dec 2020 at 16:35, Richard Biener <rguenther@suse.de> wrote:
> >
> > On Thu, 3 Dec 2020, Prathamesh Kulkarni wrote:
> >
> > > On Tue, 1 Dec 2020 at 16:39, Richard Biener <rguenther@suse.de> wrote:
> > > >
> > > > On Tue, 1 Dec 2020, Prathamesh Kulkarni wrote:
> > > >
> > > > > Hi,
> > > > > For the test mentioned in PR, I was trying to see if we could do
> > > > > specialized expansion for vcond in target when operands are -1 and 0.
> > > > > arm_expand_vcond gets the following operands:
> > > > > (reg:V8QI 113 [ _2 ])
> > > > > (reg:V8QI 117)
> > > > > (reg:V8QI 118)
> > > > > (lt (reg/v:V8QI 115 [ a ])
> > > > >     (reg/v:V8QI 116 [ b ]))
> > > > > (reg/v:V8QI 115 [ a ])
> > > > > (reg/v:V8QI 116 [ b ])
> > > > >
> > > > > where r117 and r118 are set to vector constants -1 and 0 respectively.
> > > > > However, I am not sure if there's a way to check if the register is
> > > > > constant during expansion time (since we don't have df analysis yet) ?
> > > > >
> > > > > Alternatively, should we add a target hook that returns true if the
> > > > > result of vector comparison is set to all-ones or all-zeros, and then
> > > > > use this hook in gimple ISEL to effectively turn VEC_COND_EXPR into nop ?
> > > >
> > > > Would everything match-up for a .VEC_CMP IFN producing a non-mask
> > > > vector type?  ISEL could special case the a ? -1 : 0 case this way.
> > > I think the vec_cmp pattern matches but it produces a masked vector type.
> > > In the attached patch, I simply replaced:
> > > _1 = a < b
> > > x = _1 ? -1 : 0
> > > with
> > > x = view_convert_expr<_1>
> > >
> > > For the test-case, isel generates:
> > >   vector(8) <signed-boolean:8> _1;
> > >   vector(8) signed char _2;
> > >   uint8x8_t _5;
> > >
> > >   <bb 2> [local count: 1073741824]:
> > >   _1 = a_3(D) < b_4(D);
> > >   _2 = VIEW_CONVERT_EXPR<vector(8) signed char>(_1);
> > >   _5 = VIEW_CONVERT_EXPR<uint8x8_t>(_2);
> > >   return _5;
> > >
> > > and results in desired code-gen:
> > > f1:
> > >         vcgt.s8 d0, d1, d0
> > >         bx      lr
> > >
> > > Altho I guess, we should remove the redundant conversions during isel itself ?
> > > and result in:
> > > _1 = a_3(D) < b_4(D)
> > > _5 = VIEW_CONVERT_EXPR<uint8x8_t>(_1)
> > >
> > > (Patch is lightly tested with only vect.exp)
> >
> > +  /* For targets where result of comparison is all-ones or all-zeros,
> > +     a < b ? -1 : 0 can be reduced to a < b.  */
> > +
> > +  if (integer_minus_onep (op1) && integer_zerop (op2))
> > +    {
> >
> > So this really belongs here:
> >
> >           tree op0_type = TREE_TYPE (op0);
> >           tree op0a_type = TREE_TYPE (op0a);
> >
> > <---
> >
> >           if (used_vec_cond_exprs >= 2
> >               && (get_vcond_mask_icode (mode, TYPE_MODE (op0_type))
> >                   != CODE_FOR_nothing)
> >               && expand_vec_cmp_expr_p (op0a_type, op0_type, tcode))
> >             {
> >
> >
> > +      gassign *def_stmt = dyn_cast<gassign *> (SSA_NAME_DEF_STMT (op0));
> > +      tree op0a = gimple_assign_rhs1 (def_stmt);
> > +      tree op0_type = TREE_TYPE (op0);
> > +      tree op0a_type = TREE_TYPE (op0a);
> > +      enum tree_code tcode = gimple_assign_rhs_code (def_stmt);
> > +
> > +      if (expand_vec_cmp_expr_p (op0a_type, op0_type, tcode))
> > +       {
> > +         tree conv_op = build1 (VIEW_CONVERT_EXPR, TREE_TYPE (lhs), op0);
> > +         gassign *new_stmt = gimple_build_assign (lhs, conv_op);
> > +         gsi_replace (gsi, new_stmt, true);
> >
> > and you need to verify that the mode of the lhs and the mode of op0
> > agree and that the target can actually expand_vec_cmp_expr_p
> Thanks for the suggestions, does the attached patch look OK ?
> Sorry, I am not sure how to check if target can actually expand vec_cmp ?
> I assume that since expand_vec_cmp_expr_p queries optab and if it gets
> a valid cmp icode, that
> should be sufficient ?

Yes

> Thanks,
> Prathamesh
> >
> > Richard.
> >
> >
> >
> > > Thanks,
> > > Prathamesh
> > > >
> > > > > Thanks,
> > > > > Prathamesh
> > > > >
> > > >
> > > > --
> > > > Richard Biener <rguenther@suse.de>
> > > > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> > > > Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
> > >
> >
> > --
> > Richard Biener <rguenther@suse.de>
> > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> > Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Help with PR97872
  2020-12-04 11:48         ` Richard Biener
@ 2020-12-07  5:53           ` Prathamesh Kulkarni
  2020-12-07  7:31             ` Richard Biener
  0 siblings, 1 reply; 18+ messages in thread
From: Prathamesh Kulkarni @ 2020-12-07  5:53 UTC (permalink / raw)
  To: Richard Biener; +Cc: GCC Development

[-- Attachment #1: Type: text/plain, Size: 6321 bytes --]

On Fri, 4 Dec 2020 at 17:18, Richard Biener <rguenther@suse.de> wrote:
>
> On Fri, 4 Dec 2020, Prathamesh Kulkarni wrote:
>
> > On Thu, 3 Dec 2020 at 16:35, Richard Biener <rguenther@suse.de> wrote:
> > >
> > > On Thu, 3 Dec 2020, Prathamesh Kulkarni wrote:
> > >
> > > > On Tue, 1 Dec 2020 at 16:39, Richard Biener <rguenther@suse.de> wrote:
> > > > >
> > > > > On Tue, 1 Dec 2020, Prathamesh Kulkarni wrote:
> > > > >
> > > > > > Hi,
> > > > > > For the test mentioned in PR, I was trying to see if we could do
> > > > > > specialized expansion for vcond in target when operands are -1 and 0.
> > > > > > arm_expand_vcond gets the following operands:
> > > > > > (reg:V8QI 113 [ _2 ])
> > > > > > (reg:V8QI 117)
> > > > > > (reg:V8QI 118)
> > > > > > (lt (reg/v:V8QI 115 [ a ])
> > > > > >     (reg/v:V8QI 116 [ b ]))
> > > > > > (reg/v:V8QI 115 [ a ])
> > > > > > (reg/v:V8QI 116 [ b ])
> > > > > >
> > > > > > where r117 and r118 are set to vector constants -1 and 0 respectively.
> > > > > > However, I am not sure if there's a way to check if the register is
> > > > > > constant during expansion time (since we don't have df analysis yet) ?
> > > > > >
> > > > > > Alternatively, should we add a target hook that returns true if the
> > > > > > result of vector comparison is set to all-ones or all-zeros, and then
> > > > > > use this hook in gimple ISEL to effectively turn VEC_COND_EXPR into nop ?
> > > > >
> > > > > Would everything match-up for a .VEC_CMP IFN producing a non-mask
> > > > > vector type?  ISEL could special case the a ? -1 : 0 case this way.
> > > > I think the vec_cmp pattern matches but it produces a masked vector type.
> > > > In the attached patch, I simply replaced:
> > > > _1 = a < b
> > > > x = _1 ? -1 : 0
> > > > with
> > > > x = view_convert_expr<_1>
> > > >
> > > > For the test-case, isel generates:
> > > >   vector(8) <signed-boolean:8> _1;
> > > >   vector(8) signed char _2;
> > > >   uint8x8_t _5;
> > > >
> > > >   <bb 2> [local count: 1073741824]:
> > > >   _1 = a_3(D) < b_4(D);
> > > >   _2 = VIEW_CONVERT_EXPR<vector(8) signed char>(_1);
> > > >   _5 = VIEW_CONVERT_EXPR<uint8x8_t>(_2);
> > > >   return _5;
> > > >
> > > > and results in desired code-gen:
> > > > f1:
> > > >         vcgt.s8 d0, d1, d0
> > > >         bx      lr
> > > >
> > > > Altho I guess, we should remove the redundant conversions during isel itself ?
> > > > and result in:
> > > > _1 = a_3(D) < b_4(D)
> > > > _5 = VIEW_CONVERT_EXPR<uint8x8_t>(_1)
> > > >
> > > > (Patch is lightly tested with only vect.exp)
> > >
> > > +  /* For targets where result of comparison is all-ones or all-zeros,
> > > +     a < b ? -1 : 0 can be reduced to a < b.  */
> > > +
> > > +  if (integer_minus_onep (op1) && integer_zerop (op2))
> > > +    {
> > >
> > > So this really belongs here:
> > >
> > >           tree op0_type = TREE_TYPE (op0);
> > >           tree op0a_type = TREE_TYPE (op0a);
> > >
> > > <---
> > >
> > >           if (used_vec_cond_exprs >= 2
> > >               && (get_vcond_mask_icode (mode, TYPE_MODE (op0_type))
> > >                   != CODE_FOR_nothing)
> > >               && expand_vec_cmp_expr_p (op0a_type, op0_type, tcode))
> > >             {
> > >
> > >
> > > +      gassign *def_stmt = dyn_cast<gassign *> (SSA_NAME_DEF_STMT (op0));
> > > +      tree op0a = gimple_assign_rhs1 (def_stmt);
> > > +      tree op0_type = TREE_TYPE (op0);
> > > +      tree op0a_type = TREE_TYPE (op0a);
> > > +      enum tree_code tcode = gimple_assign_rhs_code (def_stmt);
> > > +
> > > +      if (expand_vec_cmp_expr_p (op0a_type, op0_type, tcode))
> > > +       {
> > > +         tree conv_op = build1 (VIEW_CONVERT_EXPR, TREE_TYPE (lhs), op0);
> > > +         gassign *new_stmt = gimple_build_assign (lhs, conv_op);
> > > +         gsi_replace (gsi, new_stmt, true);
> > >
> > > and you need to verify that the mode of the lhs and the mode of op0
> > > agree and that the target can actually expand_vec_cmp_expr_p
> > Thanks for the suggestions, does the attached patch look OK ?
> > Sorry, I am not sure how to check if target can actually expand vec_cmp ?
> > I assume that since expand_vec_cmp_expr_p queries optab and if it gets
> > a valid cmp icode, that
> > should be sufficient ?
>
> Yes
Hi Richard,
I tested the patch, and it shows one regression for pr78102.c, because
of extra pcmpeqq in code-gen for x != y on x86.
For the test-case:
__v2di
baz (const __v2di x, const __v2di y)
{
  return x != y;
}

Before patch:
baz:
        pcmpeqq %xmm1, %xmm0
        pcmpeqd %xmm1, %xmm1
        pandn   %xmm1, %xmm0
        ret

After patch,
Before ISEL:
  vector(2) <signed-boolean:64> _1;
  __v2di _4;

  <bb 2> [local count: 1073741824]:
  _1 = x_2(D) != y_3(D);
  _4 = VEC_COND_EXPR <_1, { -1, -1 }, { 0, 0 }>;
  return _4;

After ISEL:
  vector(2) <signed-boolean:64> _1;
  __v2di _4;

  <bb 2> [local count: 1073741824]:
  _1 = x_2(D) != y_3(D);
  _4 = VIEW_CONVERT_EXPR<__v2di>(_1);
  return _4;

which results in:
        pcmpeqq %xmm1, %xmm0
        pxor    %xmm1, %xmm1
        pcmpeqq %xmm1, %xmm0
        ret

IIUC, the new code-gen is essentially comparing two args for equality, and then
comparing the result against zero to invert it, so it looks correct ?
I am not sure which of the above two sequences is better tho ?
If the new code-gen is OK, would it be OK to adjust the test-case ?

Thanks,
Prathamesh
>
> > Thanks,
> > Prathamesh
> > >
> > > Richard.
> > >
> > >
> > >
> > > > Thanks,
> > > > Prathamesh
> > > > >
> > > > > > Thanks,
> > > > > > Prathamesh
> > > > > >
> > > > >
> > > > > --
> > > > > Richard Biener <rguenther@suse.de>
> > > > > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> > > > > Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
> > > >
> > >
> > > --
> > > Richard Biener <rguenther@suse.de>
> > > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> > > Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
> >
>
> --
> Richard Biener <rguenther@suse.de>
> SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)

[-- Attachment #2: pr97872-2.diff --]
[-- Type: application/octet-stream, Size: 1463 bytes --]

diff --git a/gcc/gimple-isel.cc b/gcc/gimple-isel.cc
index 048b407bd11..1f9a17105ea 100644
--- a/gcc/gimple-isel.cc
+++ b/gcc/gimple-isel.cc
@@ -184,6 +184,21 @@ gimple_expand_vec_cond_expr (gimple_stmt_iterator *gsi,
 
 	  tree op0_type = TREE_TYPE (op0);
 	  tree op0a_type = TREE_TYPE (op0a);
+
+	  /* For targets where result of comparison is all-ones or all-zeros,
+	     a < b ? -1 : 0 can be reduced to a < b.  */
+
+	  if (integer_minus_onep (op1)
+	      && integer_zerop (op2)
+	      && TYPE_MODE (TREE_TYPE (lhs)) == TYPE_MODE (TREE_TYPE (op0))
+	      && expand_vec_cmp_expr_p (op0a_type, op0_type, tcode))
+	    {
+	      tree conv_op = build1 (VIEW_CONVERT_EXPR, TREE_TYPE (lhs), op0);
+	      gassign *new_stmt = gimple_build_assign (lhs, conv_op);
+	      gsi_replace (gsi, new_stmt, true);
+	      return new_stmt;
+	    }
+
 	  if (used_vec_cond_exprs >= 2
 	      && (get_vcond_mask_icode (mode, TYPE_MODE (op0_type))
 		  != CODE_FOR_nothing)
diff --git a/gcc/testsuite/gcc.target/arm/pr97872.c b/gcc/testsuite/gcc.target/arm/pr97872.c
new file mode 100644
index 00000000000..eeb4dd9d6bc
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/pr97872.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-O3" } */
+/* { dg-add-options arm_neon } */
+
+#include <arm_neon.h>
+
+uint8x8_t f1(int8x8_t a, int8x8_t b) {
+  return a < b;
+}
+
+/* { dg-final { scan-assembler-not "vbsl" } } */

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Help with PR97872
  2020-12-07  5:53           ` Prathamesh Kulkarni
@ 2020-12-07  7:31             ` Richard Biener
  2020-12-07  9:33               ` Prathamesh Kulkarni
  0 siblings, 1 reply; 18+ messages in thread
From: Richard Biener @ 2020-12-07  7:31 UTC (permalink / raw)
  To: Prathamesh Kulkarni; +Cc: GCC Development, hongtao.liu

On Mon, 7 Dec 2020, Prathamesh Kulkarni wrote:

> On Fri, 4 Dec 2020 at 17:18, Richard Biener <rguenther@suse.de> wrote:
> >
> > On Fri, 4 Dec 2020, Prathamesh Kulkarni wrote:
> >
> > > On Thu, 3 Dec 2020 at 16:35, Richard Biener <rguenther@suse.de> wrote:
> > > >
> > > > On Thu, 3 Dec 2020, Prathamesh Kulkarni wrote:
> > > >
> > > > > On Tue, 1 Dec 2020 at 16:39, Richard Biener <rguenther@suse.de> wrote:
> > > > > >
> > > > > > On Tue, 1 Dec 2020, Prathamesh Kulkarni wrote:
> > > > > >
> > > > > > > Hi,
> > > > > > > For the test mentioned in PR, I was trying to see if we could do
> > > > > > > specialized expansion for vcond in target when operands are -1 and 0.
> > > > > > > arm_expand_vcond gets the following operands:
> > > > > > > (reg:V8QI 113 [ _2 ])
> > > > > > > (reg:V8QI 117)
> > > > > > > (reg:V8QI 118)
> > > > > > > (lt (reg/v:V8QI 115 [ a ])
> > > > > > >     (reg/v:V8QI 116 [ b ]))
> > > > > > > (reg/v:V8QI 115 [ a ])
> > > > > > > (reg/v:V8QI 116 [ b ])
> > > > > > >
> > > > > > > where r117 and r118 are set to vector constants -1 and 0 respectively.
> > > > > > > However, I am not sure if there's a way to check if the register is
> > > > > > > constant during expansion time (since we don't have df analysis yet) ?
> > > > > > >
> > > > > > > Alternatively, should we add a target hook that returns true if the
> > > > > > > result of vector comparison is set to all-ones or all-zeros, and then
> > > > > > > use this hook in gimple ISEL to effectively turn VEC_COND_EXPR into nop ?
> > > > > >
> > > > > > Would everything match-up for a .VEC_CMP IFN producing a non-mask
> > > > > > vector type?  ISEL could special case the a ? -1 : 0 case this way.
> > > > > I think the vec_cmp pattern matches but it produces a masked vector type.
> > > > > In the attached patch, I simply replaced:
> > > > > _1 = a < b
> > > > > x = _1 ? -1 : 0
> > > > > with
> > > > > x = view_convert_expr<_1>
> > > > >
> > > > > For the test-case, isel generates:
> > > > >   vector(8) <signed-boolean:8> _1;
> > > > >   vector(8) signed char _2;
> > > > >   uint8x8_t _5;
> > > > >
> > > > >   <bb 2> [local count: 1073741824]:
> > > > >   _1 = a_3(D) < b_4(D);
> > > > >   _2 = VIEW_CONVERT_EXPR<vector(8) signed char>(_1);
> > > > >   _5 = VIEW_CONVERT_EXPR<uint8x8_t>(_2);
> > > > >   return _5;
> > > > >
> > > > > and results in desired code-gen:
> > > > > f1:
> > > > >         vcgt.s8 d0, d1, d0
> > > > >         bx      lr
> > > > >
> > > > > Altho I guess, we should remove the redundant conversions during isel itself ?
> > > > > and result in:
> > > > > _1 = a_3(D) < b_4(D)
> > > > > _5 = VIEW_CONVERT_EXPR<uint8x8_t>(_1)
> > > > >
> > > > > (Patch is lightly tested with only vect.exp)
> > > >
> > > > +  /* For targets where result of comparison is all-ones or all-zeros,
> > > > +     a < b ? -1 : 0 can be reduced to a < b.  */
> > > > +
> > > > +  if (integer_minus_onep (op1) && integer_zerop (op2))
> > > > +    {
> > > >
> > > > So this really belongs here:
> > > >
> > > >           tree op0_type = TREE_TYPE (op0);
> > > >           tree op0a_type = TREE_TYPE (op0a);
> > > >
> > > > <---
> > > >
> > > >           if (used_vec_cond_exprs >= 2
> > > >               && (get_vcond_mask_icode (mode, TYPE_MODE (op0_type))
> > > >                   != CODE_FOR_nothing)
> > > >               && expand_vec_cmp_expr_p (op0a_type, op0_type, tcode))
> > > >             {
> > > >
> > > >
> > > > +      gassign *def_stmt = dyn_cast<gassign *> (SSA_NAME_DEF_STMT (op0));
> > > > +      tree op0a = gimple_assign_rhs1 (def_stmt);
> > > > +      tree op0_type = TREE_TYPE (op0);
> > > > +      tree op0a_type = TREE_TYPE (op0a);
> > > > +      enum tree_code tcode = gimple_assign_rhs_code (def_stmt);
> > > > +
> > > > +      if (expand_vec_cmp_expr_p (op0a_type, op0_type, tcode))
> > > > +       {
> > > > +         tree conv_op = build1 (VIEW_CONVERT_EXPR, TREE_TYPE (lhs), op0);
> > > > +         gassign *new_stmt = gimple_build_assign (lhs, conv_op);
> > > > +         gsi_replace (gsi, new_stmt, true);
> > > >
> > > > and you need to verify that the mode of the lhs and the mode of op0
> > > > agree and that the target can actually expand_vec_cmp_expr_p
> > > Thanks for the suggestions, does the attached patch look OK ?
> > > Sorry, I am not sure how to check if target can actually expand vec_cmp ?
> > > I assume that since expand_vec_cmp_expr_p queries optab and if it gets
> > > a valid cmp icode, that
> > > should be sufficient ?
> >
> > Yes
> Hi Richard,
> I tested the patch, and it shows one regression for pr78102.c, because
> of extra pcmpeqq in code-gen for x != y on x86.
> For the test-case:
> __v2di
> baz (const __v2di x, const __v2di y)
> {
>   return x != y;
> }
> 
> Before patch:
> baz:
>         pcmpeqq %xmm1, %xmm0
>         pcmpeqd %xmm1, %xmm1
>         pandn   %xmm1, %xmm0
>         ret
> 
> After patch,
> Before ISEL:
>   vector(2) <signed-boolean:64> _1;
>   __v2di _4;
> 
>   <bb 2> [local count: 1073741824]:
>   _1 = x_2(D) != y_3(D);
>   _4 = VEC_COND_EXPR <_1, { -1, -1 }, { 0, 0 }>;
>   return _4;
> 
> After ISEL:
>   vector(2) <signed-boolean:64> _1;
>   __v2di _4;
> 
>   <bb 2> [local count: 1073741824]:
>   _1 = x_2(D) != y_3(D);
>   _4 = VIEW_CONVERT_EXPR<__v2di>(_1);
>   return _4;
> 
> which results in:
>         pcmpeqq %xmm1, %xmm0
>         pxor    %xmm1, %xmm1
>         pcmpeqq %xmm1, %xmm0
>         ret
> 
> IIUC, the new code-gen is essentially comparing two args for equality, and then
> comparing the result against zero to invert it, so it looks correct ?
> I am not sure which of the above two sequences is better tho ?
> If the new code-gen is OK, would it be OK to adjust the test-case ?

In case pcmpeqq is double-issue the first variant might be faster while
the second variant has the advantage of the "free" pxor, but back-to-back
pcmpeqq might have an issue.

I think on GIMPLE the new code is preferable and adjustments are
target business.  I wouldn't be surprised if the x86 backend
special-cases vcond to {-1,-1}, {0,0} already to arrive at the first
variant.

Did you check how

a = x != y ? { -1, -1 } : {0, 0 };
b = x != y ? { 1, 2 } : { 3, 4 };

is handled before/after your patch?  That is, make the comparison
CSEd between two VEC_COND_EXPRs?

Thanks,
Richard.


> Thanks,
> Prathamesh
> >
> > > Thanks,
> > > Prathamesh
> > > >
> > > > Richard.
> > > >
> > > >
> > > >
> > > > > Thanks,
> > > > > Prathamesh
> > > > > >
> > > > > > > Thanks,
> > > > > > > Prathamesh
> > > > > > >
> > > > > >
> > > > > > --
> > > > > > Richard Biener <rguenther@suse.de>
> > > > > > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> > > > > > Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
> > > > >
> > > >
> > > > --
> > > > Richard Biener <rguenther@suse.de>
> > > > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> > > > Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
> > >
> >
> > --
> > Richard Biener <rguenther@suse.de>
> > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> > Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Help with PR97872
  2020-12-07  7:31             ` Richard Biener
@ 2020-12-07  9:33               ` Prathamesh Kulkarni
  2020-12-07  9:45                 ` Richard Biener
  0 siblings, 1 reply; 18+ messages in thread
From: Prathamesh Kulkarni @ 2020-12-07  9:33 UTC (permalink / raw)
  To: Richard Biener; +Cc: GCC Development, hongtao.liu

On Mon, 7 Dec 2020 at 13:01, Richard Biener <rguenther@suse.de> wrote:
>
> On Mon, 7 Dec 2020, Prathamesh Kulkarni wrote:
>
> > On Fri, 4 Dec 2020 at 17:18, Richard Biener <rguenther@suse.de> wrote:
> > >
> > > On Fri, 4 Dec 2020, Prathamesh Kulkarni wrote:
> > >
> > > > On Thu, 3 Dec 2020 at 16:35, Richard Biener <rguenther@suse.de> wrote:
> > > > >
> > > > > On Thu, 3 Dec 2020, Prathamesh Kulkarni wrote:
> > > > >
> > > > > > On Tue, 1 Dec 2020 at 16:39, Richard Biener <rguenther@suse.de> wrote:
> > > > > > >
> > > > > > > On Tue, 1 Dec 2020, Prathamesh Kulkarni wrote:
> > > > > > >
> > > > > > > > Hi,
> > > > > > > > For the test mentioned in PR, I was trying to see if we could do
> > > > > > > > specialized expansion for vcond in target when operands are -1 and 0.
> > > > > > > > arm_expand_vcond gets the following operands:
> > > > > > > > (reg:V8QI 113 [ _2 ])
> > > > > > > > (reg:V8QI 117)
> > > > > > > > (reg:V8QI 118)
> > > > > > > > (lt (reg/v:V8QI 115 [ a ])
> > > > > > > >     (reg/v:V8QI 116 [ b ]))
> > > > > > > > (reg/v:V8QI 115 [ a ])
> > > > > > > > (reg/v:V8QI 116 [ b ])
> > > > > > > >
> > > > > > > > where r117 and r118 are set to vector constants -1 and 0 respectively.
> > > > > > > > However, I am not sure if there's a way to check if the register is
> > > > > > > > constant during expansion time (since we don't have df analysis yet) ?
> > > > > > > >
> > > > > > > > Alternatively, should we add a target hook that returns true if the
> > > > > > > > result of vector comparison is set to all-ones or all-zeros, and then
> > > > > > > > use this hook in gimple ISEL to effectively turn VEC_COND_EXPR into nop ?
> > > > > > >
> > > > > > > Would everything match-up for a .VEC_CMP IFN producing a non-mask
> > > > > > > vector type?  ISEL could special case the a ? -1 : 0 case this way.
> > > > > > I think the vec_cmp pattern matches but it produces a masked vector type.
> > > > > > In the attached patch, I simply replaced:
> > > > > > _1 = a < b
> > > > > > x = _1 ? -1 : 0
> > > > > > with
> > > > > > x = view_convert_expr<_1>
> > > > > >
> > > > > > For the test-case, isel generates:
> > > > > >   vector(8) <signed-boolean:8> _1;
> > > > > >   vector(8) signed char _2;
> > > > > >   uint8x8_t _5;
> > > > > >
> > > > > >   <bb 2> [local count: 1073741824]:
> > > > > >   _1 = a_3(D) < b_4(D);
> > > > > >   _2 = VIEW_CONVERT_EXPR<vector(8) signed char>(_1);
> > > > > >   _5 = VIEW_CONVERT_EXPR<uint8x8_t>(_2);
> > > > > >   return _5;
> > > > > >
> > > > > > and results in desired code-gen:
> > > > > > f1:
> > > > > >         vcgt.s8 d0, d1, d0
> > > > > >         bx      lr
> > > > > >
> > > > > > Altho I guess, we should remove the redundant conversions during isel itself ?
> > > > > > and result in:
> > > > > > _1 = a_3(D) < b_4(D)
> > > > > > _5 = VIEW_CONVERT_EXPR<uint8x8_t>(_1)
> > > > > >
> > > > > > (Patch is lightly tested with only vect.exp)
> > > > >
> > > > > +  /* For targets where result of comparison is all-ones or all-zeros,
> > > > > +     a < b ? -1 : 0 can be reduced to a < b.  */
> > > > > +
> > > > > +  if (integer_minus_onep (op1) && integer_zerop (op2))
> > > > > +    {
> > > > >
> > > > > So this really belongs here:
> > > > >
> > > > >           tree op0_type = TREE_TYPE (op0);
> > > > >           tree op0a_type = TREE_TYPE (op0a);
> > > > >
> > > > > <---
> > > > >
> > > > >           if (used_vec_cond_exprs >= 2
> > > > >               && (get_vcond_mask_icode (mode, TYPE_MODE (op0_type))
> > > > >                   != CODE_FOR_nothing)
> > > > >               && expand_vec_cmp_expr_p (op0a_type, op0_type, tcode))
> > > > >             {
> > > > >
> > > > >
> > > > > +      gassign *def_stmt = dyn_cast<gassign *> (SSA_NAME_DEF_STMT (op0));
> > > > > +      tree op0a = gimple_assign_rhs1 (def_stmt);
> > > > > +      tree op0_type = TREE_TYPE (op0);
> > > > > +      tree op0a_type = TREE_TYPE (op0a);
> > > > > +      enum tree_code tcode = gimple_assign_rhs_code (def_stmt);
> > > > > +
> > > > > +      if (expand_vec_cmp_expr_p (op0a_type, op0_type, tcode))
> > > > > +       {
> > > > > +         tree conv_op = build1 (VIEW_CONVERT_EXPR, TREE_TYPE (lhs), op0);
> > > > > +         gassign *new_stmt = gimple_build_assign (lhs, conv_op);
> > > > > +         gsi_replace (gsi, new_stmt, true);
> > > > >
> > > > > and you need to verify that the mode of the lhs and the mode of op0
> > > > > agree and that the target can actually expand_vec_cmp_expr_p
> > > > Thanks for the suggestions, does the attached patch look OK ?
> > > > Sorry, I am not sure how to check if target can actually expand vec_cmp ?
> > > > I assume that since expand_vec_cmp_expr_p queries optab and if it gets
> > > > a valid cmp icode, that
> > > > should be sufficient ?
> > >
> > > Yes
> > Hi Richard,
> > I tested the patch, and it shows one regression for pr78102.c, because
> > of extra pcmpeqq in code-gen for x != y on x86.
> > For the test-case:
> > __v2di
> > baz (const __v2di x, const __v2di y)
> > {
> >   return x != y;
> > }
> >
> > Before patch:
> > baz:
> >         pcmpeqq %xmm1, %xmm0
> >         pcmpeqd %xmm1, %xmm1
> >         pandn   %xmm1, %xmm0
> >         ret
> >
> > After patch,
> > Before ISEL:
> >   vector(2) <signed-boolean:64> _1;
> >   __v2di _4;
> >
> >   <bb 2> [local count: 1073741824]:
> >   _1 = x_2(D) != y_3(D);
> >   _4 = VEC_COND_EXPR <_1, { -1, -1 }, { 0, 0 }>;
> >   return _4;
> >
> > After ISEL:
> >   vector(2) <signed-boolean:64> _1;
> >   __v2di _4;
> >
> >   <bb 2> [local count: 1073741824]:
> >   _1 = x_2(D) != y_3(D);
> >   _4 = VIEW_CONVERT_EXPR<__v2di>(_1);
> >   return _4;
> >
> > which results in:
> >         pcmpeqq %xmm1, %xmm0
> >         pxor    %xmm1, %xmm1
> >         pcmpeqq %xmm1, %xmm0
> >         ret
> >
> > IIUC, the new code-gen is essentially comparing two args for equality, and then
> > comparing the result against zero to invert it, so it looks correct ?
> > I am not sure which of the above two sequences is better tho ?
> > If the new code-gen is OK, would it be OK to adjust the test-case ?
>
> In case pcmpeqq is double-issue the first variant might be faster while
> the second variant has the advantage of the "free" pxor, but back-to-back
> pcmpeqq might have an issue.
>
> I think on GIMPLE the new code is preferable and adjustments are
> target business.  I wouldn't be surprised if the x86 backend
> special-cases vcond to {-1,-1}, {0,0} already to arrive at the first
> variant.
>
> Did you check how
>
> a = x != y ? { -1, -1 } : {0, 0 };
> b = x != y ? { 1, 2 } : { 3, 4 };
>
> is handled before/after your patch?  That is, make the comparison
> CSEd between two VEC_COND_EXPRs?
For the test-case:
__v2di f(__v2di, __v2di);

__v2di
baz (const __v2di x, const __v2di y)
{
  __v2di a = (x != y);
  __v2di b = (x != y) ? (__v2di) {1, 2} : (__v2di) {3, 4};
  return f (a, b);
}

Before patch, isel converts both to .vcondeq:
  __v2di b;
  __v2di a;
  __v2di _8;

  <bb 2> [local count: 1073741824]:
  a_4 = .VCONDEQ (x_2(D), y_3(D), { -1, -1 }, { 0, 0 }, 114);
  b_5 = .VCONDEQ (x_2(D), y_3(D), { 1, 2 }, { 3, 4 }, 114);
  _8 = f (a_4, b_5); [tail call]
  return _8;

and results in following code-gen:
_Z3bazDv2_xS_:
.LFB5666:
        pcmpeqq %xmm1, %xmm0
        pcmpeqd %xmm1, %xmm1
        movdqa  %xmm0, %xmm2
        pandn   %xmm1, %xmm2
        movdqa  .LC0(%rip), %xmm1
        pblendvb        %xmm0, .LC1(%rip), %xmm1
        movdqa  %xmm2, %xmm0
        jmp     _Z1fDv2_xS_

With patch, isel converts a = (x != y) ? {-1, -1} : {0, 0} to
view_convert_expr and the other
to vcondeq:
  __v2di b;
  __v2di a;
  vector(2) <signed-boolean:64> _1;
  __v2di _8;

  <bb 2> [local count: 1073741824]:
  _1 = x_2(D) != y_3(D);
  a_4 = VIEW_CONVERT_EXPR<__v2di>(_1);
  b_5 = .VCONDEQ (x_2(D), y_3(D), { 1, 2 }, { 3, 4 }, 114);
  _8 = f (a_4, b_5); [tail call]
  return _8;

which results in following code-gen:
_Z3bazDv2_xS_:
.LFB5666:
        pcmpeqq %xmm1, %xmm0
        pxor    %xmm2, %xmm2
        movdqa  .LC0(%rip), %xmm1
        pblendvb        %xmm0, .LC1(%rip), %xmm1
        pcmpeqq %xmm0, %xmm2
        movdqa  %xmm2, %xmm0
        jmp     _Z1fDv2_xS_

Thanks,
Prathamesh
>
> Thanks,
> Richard.
>
>
> > Thanks,
> > Prathamesh
> > >
> > > > Thanks,
> > > > Prathamesh
> > > > >
> > > > > Richard.
> > > > >
> > > > >
> > > > >
> > > > > > Thanks,
> > > > > > Prathamesh
> > > > > > >
> > > > > > > > Thanks,
> > > > > > > > Prathamesh
> > > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > > Richard Biener <rguenther@suse.de>
> > > > > > > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> > > > > > > Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
> > > > > >
> > > > >
> > > > > --
> > > > > Richard Biener <rguenther@suse.de>
> > > > > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> > > > > Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
> > > >
> > >
> > > --
> > > Richard Biener <rguenther@suse.de>
> > > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> > > Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
> >
>
> --
> Richard Biener <rguenther@suse.de>
> SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Help with PR97872
  2020-12-07  9:33               ` Prathamesh Kulkarni
@ 2020-12-07  9:45                 ` Richard Biener
  2020-12-07 10:47                   ` Hongtao Liu
  0 siblings, 1 reply; 18+ messages in thread
From: Richard Biener @ 2020-12-07  9:45 UTC (permalink / raw)
  To: Prathamesh Kulkarni; +Cc: GCC Development, hongtao.liu

On Mon, 7 Dec 2020, Prathamesh Kulkarni wrote:

> On Mon, 7 Dec 2020 at 13:01, Richard Biener <rguenther@suse.de> wrote:
> >
> > On Mon, 7 Dec 2020, Prathamesh Kulkarni wrote:
> >
> > > On Fri, 4 Dec 2020 at 17:18, Richard Biener <rguenther@suse.de> wrote:
> > > >
> > > > On Fri, 4 Dec 2020, Prathamesh Kulkarni wrote:
> > > >
> > > > > On Thu, 3 Dec 2020 at 16:35, Richard Biener <rguenther@suse.de> wrote:
> > > > > >
> > > > > > On Thu, 3 Dec 2020, Prathamesh Kulkarni wrote:
> > > > > >
> > > > > > > On Tue, 1 Dec 2020 at 16:39, Richard Biener <rguenther@suse.de> wrote:
> > > > > > > >
> > > > > > > > On Tue, 1 Dec 2020, Prathamesh Kulkarni wrote:
> > > > > > > >
> > > > > > > > > Hi,
> > > > > > > > > For the test mentioned in PR, I was trying to see if we could do
> > > > > > > > > specialized expansion for vcond in target when operands are -1 and 0.
> > > > > > > > > arm_expand_vcond gets the following operands:
> > > > > > > > > (reg:V8QI 113 [ _2 ])
> > > > > > > > > (reg:V8QI 117)
> > > > > > > > > (reg:V8QI 118)
> > > > > > > > > (lt (reg/v:V8QI 115 [ a ])
> > > > > > > > >     (reg/v:V8QI 116 [ b ]))
> > > > > > > > > (reg/v:V8QI 115 [ a ])
> > > > > > > > > (reg/v:V8QI 116 [ b ])
> > > > > > > > >
> > > > > > > > > where r117 and r118 are set to vector constants -1 and 0 respectively.
> > > > > > > > > However, I am not sure if there's a way to check if the register is
> > > > > > > > > constant during expansion time (since we don't have df analysis yet) ?
> > > > > > > > >
> > > > > > > > > Alternatively, should we add a target hook that returns true if the
> > > > > > > > > result of vector comparison is set to all-ones or all-zeros, and then
> > > > > > > > > use this hook in gimple ISEL to effectively turn VEC_COND_EXPR into nop ?
> > > > > > > >
> > > > > > > > Would everything match-up for a .VEC_CMP IFN producing a non-mask
> > > > > > > > vector type?  ISEL could special case the a ? -1 : 0 case this way.
> > > > > > > I think the vec_cmp pattern matches but it produces a masked vector type.
> > > > > > > In the attached patch, I simply replaced:
> > > > > > > _1 = a < b
> > > > > > > x = _1 ? -1 : 0
> > > > > > > with
> > > > > > > x = view_convert_expr<_1>
> > > > > > >
> > > > > > > For the test-case, isel generates:
> > > > > > >   vector(8) <signed-boolean:8> _1;
> > > > > > >   vector(8) signed char _2;
> > > > > > >   uint8x8_t _5;
> > > > > > >
> > > > > > >   <bb 2> [local count: 1073741824]:
> > > > > > >   _1 = a_3(D) < b_4(D);
> > > > > > >   _2 = VIEW_CONVERT_EXPR<vector(8) signed char>(_1);
> > > > > > >   _5 = VIEW_CONVERT_EXPR<uint8x8_t>(_2);
> > > > > > >   return _5;
> > > > > > >
> > > > > > > and results in desired code-gen:
> > > > > > > f1:
> > > > > > >         vcgt.s8 d0, d1, d0
> > > > > > >         bx      lr
> > > > > > >
> > > > > > > Altho I guess, we should remove the redundant conversions during isel itself ?
> > > > > > > and result in:
> > > > > > > _1 = a_3(D) < b_4(D)
> > > > > > > _5 = VIEW_CONVERT_EXPR<uint8x8_t>(_1)
> > > > > > >
> > > > > > > (Patch is lightly tested with only vect.exp)
> > > > > >
> > > > > > +  /* For targets where result of comparison is all-ones or all-zeros,
> > > > > > +     a < b ? -1 : 0 can be reduced to a < b.  */
> > > > > > +
> > > > > > +  if (integer_minus_onep (op1) && integer_zerop (op2))
> > > > > > +    {
> > > > > >
> > > > > > So this really belongs here:
> > > > > >
> > > > > >           tree op0_type = TREE_TYPE (op0);
> > > > > >           tree op0a_type = TREE_TYPE (op0a);
> > > > > >
> > > > > > <---
> > > > > >
> > > > > >           if (used_vec_cond_exprs >= 2
> > > > > >               && (get_vcond_mask_icode (mode, TYPE_MODE (op0_type))
> > > > > >                   != CODE_FOR_nothing)
> > > > > >               && expand_vec_cmp_expr_p (op0a_type, op0_type, tcode))
> > > > > >             {
> > > > > >
> > > > > >
> > > > > > +      gassign *def_stmt = dyn_cast<gassign *> (SSA_NAME_DEF_STMT (op0));
> > > > > > +      tree op0a = gimple_assign_rhs1 (def_stmt);
> > > > > > +      tree op0_type = TREE_TYPE (op0);
> > > > > > +      tree op0a_type = TREE_TYPE (op0a);
> > > > > > +      enum tree_code tcode = gimple_assign_rhs_code (def_stmt);
> > > > > > +
> > > > > > +      if (expand_vec_cmp_expr_p (op0a_type, op0_type, tcode))
> > > > > > +       {
> > > > > > +         tree conv_op = build1 (VIEW_CONVERT_EXPR, TREE_TYPE (lhs), op0);
> > > > > > +         gassign *new_stmt = gimple_build_assign (lhs, conv_op);
> > > > > > +         gsi_replace (gsi, new_stmt, true);
> > > > > >
> > > > > > and you need to verify that the mode of the lhs and the mode of op0
> > > > > > agree and that the target can actually expand_vec_cmp_expr_p
> > > > > Thanks for the suggestions, does the attached patch look OK ?
> > > > > Sorry, I am not sure how to check if target can actually expand vec_cmp ?
> > > > > I assume that since expand_vec_cmp_expr_p queries optab and if it gets
> > > > > a valid cmp icode, that
> > > > > should be sufficient ?
> > > >
> > > > Yes
> > > Hi Richard,
> > > I tested the patch, and it shows one regression for pr78102.c, because
> > > of extra pcmpeqq in code-gen for x != y on x86.
> > > For the test-case:
> > > __v2di
> > > baz (const __v2di x, const __v2di y)
> > > {
> > >   return x != y;
> > > }
> > >
> > > Before patch:
> > > baz:
> > >         pcmpeqq %xmm1, %xmm0
> > >         pcmpeqd %xmm1, %xmm1
> > >         pandn   %xmm1, %xmm0
> > >         ret
> > >
> > > After patch,
> > > Before ISEL:
> > >   vector(2) <signed-boolean:64> _1;
> > >   __v2di _4;
> > >
> > >   <bb 2> [local count: 1073741824]:
> > >   _1 = x_2(D) != y_3(D);
> > >   _4 = VEC_COND_EXPR <_1, { -1, -1 }, { 0, 0 }>;
> > >   return _4;
> > >
> > > After ISEL:
> > >   vector(2) <signed-boolean:64> _1;
> > >   __v2di _4;
> > >
> > >   <bb 2> [local count: 1073741824]:
> > >   _1 = x_2(D) != y_3(D);
> > >   _4 = VIEW_CONVERT_EXPR<__v2di>(_1);
> > >   return _4;
> > >
> > > which results in:
> > >         pcmpeqq %xmm1, %xmm0
> > >         pxor    %xmm1, %xmm1
> > >         pcmpeqq %xmm1, %xmm0
> > >         ret
> > >
> > > IIUC, the new code-gen is essentially comparing two args for equality, and then
> > > comparing the result against zero to invert it, so it looks correct ?
> > > I am not sure which of the above two sequences is better tho ?
> > > If the new code-gen is OK, would it be OK to adjust the test-case ?
> >
> > In case pcmpeqq is double-issue the first variant might be faster while
> > the second variant has the advantage of the "free" pxor, but back-to-back
> > pcmpeqq might have an issue.
> >
> > I think on GIMPLE the new code is preferable and adjustments are
> > target business.  I wouldn't be surprised if the x86 backend
> > special-cases vcond to {-1,-1}, {0,0} already to arrive at the first
> > variant.
> >
> > Did you check how
> >
> > a = x != y ? { -1, -1 } : {0, 0 };
> > b = x != y ? { 1, 2 } : { 3, 4 };
> >
> > is handled before/after your patch?  That is, make the comparison
> > CSEd between two VEC_COND_EXPRs?
> For the test-case:
> __v2di f(__v2di, __v2di);
> 
> __v2di
> baz (const __v2di x, const __v2di y)
> {
>   __v2di a = (x != y);
>   __v2di b = (x != y) ? (__v2di) {1, 2} : (__v2di) {3, 4};
>   return f (a, b);
> }
> 
> Before patch, isel converts both to .vcondeq:
>   __v2di b;
>   __v2di a;
>   __v2di _8;
> 
>   <bb 2> [local count: 1073741824]:
>   a_4 = .VCONDEQ (x_2(D), y_3(D), { -1, -1 }, { 0, 0 }, 114);
>   b_5 = .VCONDEQ (x_2(D), y_3(D), { 1, 2 }, { 3, 4 }, 114);
>   _8 = f (a_4, b_5); [tail call]
>   return _8;
> 
> and results in following code-gen:
> _Z3bazDv2_xS_:
> .LFB5666:
>         pcmpeqq %xmm1, %xmm0
>         pcmpeqd %xmm1, %xmm1
>         movdqa  %xmm0, %xmm2
>         pandn   %xmm1, %xmm2
>         movdqa  .LC0(%rip), %xmm1
>         pblendvb        %xmm0, .LC1(%rip), %xmm1
>         movdqa  %xmm2, %xmm0
>         jmp     _Z1fDv2_xS_
> 
> With patch, isel converts a = (x != y) ? {-1, -1} : {0, 0} to
> view_convert_expr and the other
> to vcondeq:
>   __v2di b;
>   __v2di a;
>   vector(2) <signed-boolean:64> _1;
>   __v2di _8;
> 
>   <bb 2> [local count: 1073741824]:
>   _1 = x_2(D) != y_3(D);
>   a_4 = VIEW_CONVERT_EXPR<__v2di>(_1);
>   b_5 = .VCONDEQ (x_2(D), y_3(D), { 1, 2 }, { 3, 4 }, 114);
>   _8 = f (a_4, b_5); [tail call]
>   return _8;
> 
> which results in following code-gen:
> _Z3bazDv2_xS_:
> .LFB5666:
>         pcmpeqq %xmm1, %xmm0
>         pxor    %xmm2, %xmm2
>         movdqa  .LC0(%rip), %xmm1
>         pblendvb        %xmm0, .LC1(%rip), %xmm1
>         pcmpeqq %xmm0, %xmm2
>         movdqa  %xmm2, %xmm0
>         jmp     _Z1fDv2_xS_

Ok, thanks for checking.  I think the patch is OK but please let
Hongtao the chance to comment.

Richard.

> Thanks,
> Prathamesh
> >
> > Thanks,
> > Richard.
> >
> >
> > > Thanks,
> > > Prathamesh
> > > >
> > > > > Thanks,
> > > > > Prathamesh
> > > > > >
> > > > > > Richard.
> > > > > >
> > > > > >
> > > > > >
> > > > > > > Thanks,
> > > > > > > Prathamesh
> > > > > > > >
> > > > > > > > > Thanks,
> > > > > > > > > Prathamesh
> > > > > > > > >
> > > > > > > >
> > > > > > > > --
> > > > > > > > Richard Biener <rguenther@suse.de>
> > > > > > > > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> > > > > > > > Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
> > > > > > >
> > > > > >
> > > > > > --
> > > > > > Richard Biener <rguenther@suse.de>
> > > > > > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> > > > > > Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
> > > > >
> > > >
> > > > --
> > > > Richard Biener <rguenther@suse.de>
> > > > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> > > > Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
> > >
> >
> > --
> > Richard Biener <rguenther@suse.de>
> > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> > Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Help with PR97872
  2020-12-07  9:45                 ` Richard Biener
@ 2020-12-07 10:47                   ` Hongtao Liu
  2020-12-07 11:10                     ` Prathamesh Kulkarni
  0 siblings, 1 reply; 18+ messages in thread
From: Hongtao Liu @ 2020-12-07 10:47 UTC (permalink / raw)
  To: Richard Biener; +Cc: Prathamesh Kulkarni, GCC Development, Liu, Hongtao

On Mon, Dec 7, 2020 at 5:47 PM Richard Biener <rguenther@suse.de> wrote:
>
> On Mon, 7 Dec 2020, Prathamesh Kulkarni wrote:
>
> > On Mon, 7 Dec 2020 at 13:01, Richard Biener <rguenther@suse.de> wrote:
> > >
> > > On Mon, 7 Dec 2020, Prathamesh Kulkarni wrote:
> > >
> > > > On Fri, 4 Dec 2020 at 17:18, Richard Biener <rguenther@suse.de> wrote:
> > > > >
> > > > > On Fri, 4 Dec 2020, Prathamesh Kulkarni wrote:
> > > > >
> > > > > > On Thu, 3 Dec 2020 at 16:35, Richard Biener <rguenther@suse.de> wrote:
> > > > > > >
> > > > > > > On Thu, 3 Dec 2020, Prathamesh Kulkarni wrote:
> > > > > > >
> > > > > > > > On Tue, 1 Dec 2020 at 16:39, Richard Biener <rguenther@suse.de> wrote:
> > > > > > > > >
> > > > > > > > > On Tue, 1 Dec 2020, Prathamesh Kulkarni wrote:
> > > > > > > > >
> > > > > > > > > > Hi,
> > > > > > > > > > For the test mentioned in PR, I was trying to see if we could do
> > > > > > > > > > specialized expansion for vcond in target when operands are -1 and 0.
> > > > > > > > > > arm_expand_vcond gets the following operands:
> > > > > > > > > > (reg:V8QI 113 [ _2 ])
> > > > > > > > > > (reg:V8QI 117)
> > > > > > > > > > (reg:V8QI 118)
> > > > > > > > > > (lt (reg/v:V8QI 115 [ a ])
> > > > > > > > > >     (reg/v:V8QI 116 [ b ]))
> > > > > > > > > > (reg/v:V8QI 115 [ a ])
> > > > > > > > > > (reg/v:V8QI 116 [ b ])
> > > > > > > > > >
> > > > > > > > > > where r117 and r118 are set to vector constants -1 and 0 respectively.
> > > > > > > > > > However, I am not sure if there's a way to check if the register is
> > > > > > > > > > constant during expansion time (since we don't have df analysis yet) ?

It seems to me that all you need to do is relax the predicates of op1
and op2 in vcondmn to accept const0_rtx and constm1_rtx. I haven't
debugged it, but I see that vcondmn in neon.md only accepts
s_register_operand.

(define_expand "vcond<mode><mode>"
  [(set (match_operand:VDQW 0 "s_register_operand")
        (if_then_else:VDQW
          (match_operator 3 "comparison_operator"
            [(match_operand:VDQW 4 "s_register_operand")
             (match_operand:VDQW 5 "reg_or_zero_operand")])
          (match_operand:VDQW 1 "s_register_operand")
          (match_operand:VDQW 2 "s_register_operand")))]
  "TARGET_NEON && (!<Is_float_mode> || flag_unsafe_math_optimizations)"
{
  arm_expand_vcond (operands, <V_cmp_result>mode);
  DONE;
})

in sse.md it's defined as
(define_expand "vcondu<V_512:mode><VI_AVX512BW:mode>"
  [(set (match_operand:V_512 0 "register_operand")
        (if_then_else:V_512
          (match_operator 3 ""
            [(match_operand:VI_AVX512BW 4 "nonimmediate_operand")
             (match_operand:VI_AVX512BW 5 "nonimmediate_operand")])
          (match_operand:V_512 1 "general_operand")
          (match_operand:V_512 2 "general_operand")))]
  "TARGET_AVX512F
   && (GET_MODE_NUNITS (<V_512:MODE>mode)
       == GET_MODE_NUNITS (<VI_AVX512BW:MODE>mode))"
{
  bool ok = ix86_expand_int_vcond (operands);
  gcc_assert (ok);
  DONE;
})

then we can get operands[1] and operands[2] as

(gdb) p debug_rtx (operands[1])
 (const_vector:V16QI [
        (const_int -1 [0xffffffffffffffff]) repeated x16
    ])
(gdb) p debug_rtx (operands[2])
(reg:V16QI 82 [ _2 ])
(const_vector:V16QI [
        (const_int 0 [0]) repeated x16
    ])

> > > > > > > > > >
> > > > > > > > > > Alternatively, should we add a target hook that returns true if the
> > > > > > > > > > result of vector comparison is set to all-ones or all-zeros, and then
> > > > > > > > > > use this hook in gimple ISEL to effectively turn VEC_COND_EXPR into nop ?
> > > > > > > > >
> > > > > > > > > Would everything match-up for a .VEC_CMP IFN producing a non-mask
> > > > > > > > > vector type?  ISEL could special case the a ? -1 : 0 case this way.
> > > > > > > > I think the vec_cmp pattern matches but it produces a masked vector type.
> > > > > > > > In the attached patch, I simply replaced:
> > > > > > > > _1 = a < b
> > > > > > > > x = _1 ? -1 : 0
> > > > > > > > with
> > > > > > > > x = view_convert_expr<_1>
> > > > > > > >
> > > > > > > > For the test-case, isel generates:
> > > > > > > >   vector(8) <signed-boolean:8> _1;
> > > > > > > >   vector(8) signed char _2;
> > > > > > > >   uint8x8_t _5;
> > > > > > > >
> > > > > > > >   <bb 2> [local count: 1073741824]:
> > > > > > > >   _1 = a_3(D) < b_4(D);
> > > > > > > >   _2 = VIEW_CONVERT_EXPR<vector(8) signed char>(_1);
> > > > > > > >   _5 = VIEW_CONVERT_EXPR<uint8x8_t>(_2);
> > > > > > > >   return _5;
> > > > > > > >
> > > > > > > > and results in desired code-gen:
> > > > > > > > f1:
> > > > > > > >         vcgt.s8 d0, d1, d0
> > > > > > > >         bx      lr
> > > > > > > >
> > > > > > > > Altho I guess, we should remove the redundant conversions during isel itself ?
> > > > > > > > and result in:
> > > > > > > > _1 = a_3(D) < b_4(D)
> > > > > > > > _5 = VIEW_CONVERT_EXPR<uint8x8_t>(_1)
> > > > > > > >
> > > > > > > > (Patch is lightly tested with only vect.exp)
> > > > > > >
> > > > > > > +  /* For targets where result of comparison is all-ones or all-zeros,
> > > > > > > +     a < b ? -1 : 0 can be reduced to a < b.  */
> > > > > > > +
> > > > > > > +  if (integer_minus_onep (op1) && integer_zerop (op2))
> > > > > > > +    {
> > > > > > >
> > > > > > > So this really belongs here:
> > > > > > >
> > > > > > >           tree op0_type = TREE_TYPE (op0);
> > > > > > >           tree op0a_type = TREE_TYPE (op0a);
> > > > > > >
> > > > > > > <---
> > > > > > >
> > > > > > >           if (used_vec_cond_exprs >= 2
> > > > > > >               && (get_vcond_mask_icode (mode, TYPE_MODE (op0_type))
> > > > > > >                   != CODE_FOR_nothing)
> > > > > > >               && expand_vec_cmp_expr_p (op0a_type, op0_type, tcode))
> > > > > > >             {
> > > > > > >
> > > > > > >
> > > > > > > +      gassign *def_stmt = dyn_cast<gassign *> (SSA_NAME_DEF_STMT (op0));
> > > > > > > +      tree op0a = gimple_assign_rhs1 (def_stmt);
> > > > > > > +      tree op0_type = TREE_TYPE (op0);
> > > > > > > +      tree op0a_type = TREE_TYPE (op0a);
> > > > > > > +      enum tree_code tcode = gimple_assign_rhs_code (def_stmt);
> > > > > > > +
> > > > > > > +      if (expand_vec_cmp_expr_p (op0a_type, op0_type, tcode))
> > > > > > > +       {
> > > > > > > +         tree conv_op = build1 (VIEW_CONVERT_EXPR, TREE_TYPE (lhs), op0);
> > > > > > > +         gassign *new_stmt = gimple_build_assign (lhs, conv_op);
> > > > > > > +         gsi_replace (gsi, new_stmt, true);
> > > > > > >
> > > > > > > and you need to verify that the mode of the lhs and the mode of op0
> > > > > > > agree and that the target can actually expand_vec_cmp_expr_p
> > > > > > Thanks for the suggestions, does the attached patch look OK ?
> > > > > > Sorry, I am not sure how to check if target can actually expand vec_cmp ?
> > > > > > I assume that since expand_vec_cmp_expr_p queries optab and if it gets
> > > > > > a valid cmp icode, that
> > > > > > should be sufficient ?
> > > > >
> > > > > Yes
> > > > Hi Richard,
> > > > I tested the patch, and it shows one regression for pr78102.c, because
> > > > of extra pcmpeqq in code-gen for x != y on x86.
> > > > For the test-case:
> > > > __v2di
> > > > baz (const __v2di x, const __v2di y)
> > > > {
> > > >   return x != y;
> > > > }
> > > >
> > > > Before patch:
> > > > baz:
> > > >         pcmpeqq %xmm1, %xmm0
> > > >         pcmpeqd %xmm1, %xmm1
> > > >         pandn   %xmm1, %xmm0
> > > >         ret
> > > >
> > > > After patch,
> > > > Before ISEL:
> > > >   vector(2) <signed-boolean:64> _1;
> > > >   __v2di _4;
> > > >
> > > >   <bb 2> [local count: 1073741824]:
> > > >   _1 = x_2(D) != y_3(D);
> > > >   _4 = VEC_COND_EXPR <_1, { -1, -1 }, { 0, 0 }>;
> > > >   return _4;
> > > >
> > > > After ISEL:
> > > >   vector(2) <signed-boolean:64> _1;
> > > >   __v2di _4;
> > > >
> > > >   <bb 2> [local count: 1073741824]:
> > > >   _1 = x_2(D) != y_3(D);
> > > >   _4 = VIEW_CONVERT_EXPR<__v2di>(_1);
> > > >   return _4;
> > > >
> > > > which results in:
> > > >         pcmpeqq %xmm1, %xmm0
> > > >         pxor    %xmm1, %xmm1
> > > >         pcmpeqq %xmm1, %xmm0
> > > >         ret
> > > >
> > > > IIUC, the new code-gen is essentially comparing two args for equality, and then
> > > > comparing the result against zero to invert it, so it looks correct ?
> > > > I am not sure which of the above two sequences is better tho ?
> > > > If the new code-gen is OK, would it be OK to adjust the test-case ?
> > >
> > > In case pcmpeqq is double-issue the first variant might be faster while
> > > the second variant has the advantage of the "free" pxor, but back-to-back
> > > pcmpeqq might have an issue.
> > >
> > > I think on GIMPLE the new code is preferable and adjustments are
> > > target business.  I wouldn't be surprised if the x86 backend
> > > special-cases vcond to {-1,-1}, {0,0} already to arrive at the first
> > > variant.
> > >
> > > Did you check how
> > >
> > > a = x != y ? { -1, -1 } : {0, 0 };
> > > b = x != y ? { 1, 2 } : { 3, 4 };
> > >
> > > is handled before/after your patch?  That is, make the comparison
> > > CSEd between two VEC_COND_EXPRs?
> > For the test-case:
> > __v2di f(__v2di, __v2di);
> >
> > __v2di
> > baz (const __v2di x, const __v2di y)
> > {
> >   __v2di a = (x != y);
> >   __v2di b = (x != y) ? (__v2di) {1, 2} : (__v2di) {3, 4};
> >   return f (a, b);
> > }
> >
> > Before patch, isel converts both to .vcondeq:
> >   __v2di b;
> >   __v2di a;
> >   __v2di _8;
> >
> >   <bb 2> [local count: 1073741824]:
> >   a_4 = .VCONDEQ (x_2(D), y_3(D), { -1, -1 }, { 0, 0 }, 114);
> >   b_5 = .VCONDEQ (x_2(D), y_3(D), { 1, 2 }, { 3, 4 }, 114);
> >   _8 = f (a_4, b_5); [tail call]
> >   return _8;
> >
> > and results in following code-gen:
> > _Z3bazDv2_xS_:
> > .LFB5666:
> >         pcmpeqq %xmm1, %xmm0
> >         pcmpeqd %xmm1, %xmm1
> >         movdqa  %xmm0, %xmm2
> >         pandn   %xmm1, %xmm2
> >         movdqa  .LC0(%rip), %xmm1
> >         pblendvb        %xmm0, .LC1(%rip), %xmm1
> >         movdqa  %xmm2, %xmm0
> >         jmp     _Z1fDv2_xS_
> >
> > With patch, isel converts a = (x != y) ? {-1, -1} : {0, 0} to
> > view_convert_expr and the other
> > to vcondeq:
> >   __v2di b;
> >   __v2di a;
> >   vector(2) <signed-boolean:64> _1;
> >   __v2di _8;
> >
> >   <bb 2> [local count: 1073741824]:
> >   _1 = x_2(D) != y_3(D);
> >   a_4 = VIEW_CONVERT_EXPR<__v2di>(_1);
> >   b_5 = .VCONDEQ (x_2(D), y_3(D), { 1, 2 }, { 3, 4 }, 114);
> >   _8 = f (a_4, b_5); [tail call]
> >   return _8;
> >
> > which results in following code-gen:
> > _Z3bazDv2_xS_:
> > .LFB5666:
> >         pcmpeqq %xmm1, %xmm0
> >         pxor    %xmm2, %xmm2
> >         movdqa  .LC0(%rip), %xmm1
> >         pblendvb        %xmm0, .LC1(%rip), %xmm1
> >         pcmpeqq %xmm0, %xmm2
> >         movdqa  %xmm2, %xmm0
> >         jmp     _Z1fDv2_xS_
>
> Ok, thanks for checking.  I think the patch is OK but please let
> Hongtao the chance to comment.
>
> Richard.
>
> > Thanks,
> > Prathamesh
> > >
> > > Thanks,
> > > Richard.
> > >
> > >
> > > > Thanks,
> > > > Prathamesh
> > > > >
> > > > > > Thanks,
> > > > > > Prathamesh
> > > > > > >
> > > > > > > Richard.
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > > Thanks,
> > > > > > > > Prathamesh
> > > > > > > > >
> > > > > > > > > > Thanks,
> > > > > > > > > > Prathamesh
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > > Richard Biener <rguenther@suse.de>
> > > > > > > > > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> > > > > > > > > Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
> > > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > > Richard Biener <rguenther@suse.de>
> > > > > > > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> > > > > > > Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
> > > > > >
> > > > >
> > > > > --
> > > > > Richard Biener <rguenther@suse.de>
> > > > > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> > > > > Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
> > > >
> > >
> > > --
> > > Richard Biener <rguenther@suse.de>
> > > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> > > Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
> >
>
> --
> Richard Biener <rguenther@suse.de>
> SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)



-- 
BR,
Hongtao

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Help with PR97872
  2020-12-07 10:47                   ` Hongtao Liu
@ 2020-12-07 11:10                     ` Prathamesh Kulkarni
  2020-12-07 12:09                       ` Hongtao Liu
  0 siblings, 1 reply; 18+ messages in thread
From: Prathamesh Kulkarni @ 2020-12-07 11:10 UTC (permalink / raw)
  To: Hongtao Liu; +Cc: Richard Biener, GCC Development, Liu, Hongtao

On Mon, 7 Dec 2020 at 16:15, Hongtao Liu <crazylht@gmail.com> wrote:
>
> On Mon, Dec 7, 2020 at 5:47 PM Richard Biener <rguenther@suse.de> wrote:
> >
> > On Mon, 7 Dec 2020, Prathamesh Kulkarni wrote:
> >
> > > On Mon, 7 Dec 2020 at 13:01, Richard Biener <rguenther@suse.de> wrote:
> > > >
> > > > On Mon, 7 Dec 2020, Prathamesh Kulkarni wrote:
> > > >
> > > > > On Fri, 4 Dec 2020 at 17:18, Richard Biener <rguenther@suse.de> wrote:
> > > > > >
> > > > > > On Fri, 4 Dec 2020, Prathamesh Kulkarni wrote:
> > > > > >
> > > > > > > On Thu, 3 Dec 2020 at 16:35, Richard Biener <rguenther@suse.de> wrote:
> > > > > > > >
> > > > > > > > On Thu, 3 Dec 2020, Prathamesh Kulkarni wrote:
> > > > > > > >
> > > > > > > > > On Tue, 1 Dec 2020 at 16:39, Richard Biener <rguenther@suse.de> wrote:
> > > > > > > > > >
> > > > > > > > > > On Tue, 1 Dec 2020, Prathamesh Kulkarni wrote:
> > > > > > > > > >
> > > > > > > > > > > Hi,
> > > > > > > > > > > For the test mentioned in PR, I was trying to see if we could do
> > > > > > > > > > > specialized expansion for vcond in target when operands are -1 and 0.
> > > > > > > > > > > arm_expand_vcond gets the following operands:
> > > > > > > > > > > (reg:V8QI 113 [ _2 ])
> > > > > > > > > > > (reg:V8QI 117)
> > > > > > > > > > > (reg:V8QI 118)
> > > > > > > > > > > (lt (reg/v:V8QI 115 [ a ])
> > > > > > > > > > >     (reg/v:V8QI 116 [ b ]))
> > > > > > > > > > > (reg/v:V8QI 115 [ a ])
> > > > > > > > > > > (reg/v:V8QI 116 [ b ])
> > > > > > > > > > >
> > > > > > > > > > > where r117 and r118 are set to vector constants -1 and 0 respectively.
> > > > > > > > > > > However, I am not sure if there's a way to check if the register is
> > > > > > > > > > > constant during expansion time (since we don't have df analysis yet) ?
>
> It seems to me that all you need to do is relax the predicates of op1
> and op2 in vcondmn to accept const0_rtx and constm1_rtx. I haven't
> debugged it, but I see that vcondmn in neon.md only accepts
> s_register_operand.
>
> (define_expand "vcond<mode><mode>"
>   [(set (match_operand:VDQW 0 "s_register_operand")
>         (if_then_else:VDQW
>           (match_operator 3 "comparison_operator"
>             [(match_operand:VDQW 4 "s_register_operand")
>              (match_operand:VDQW 5 "reg_or_zero_operand")])
>           (match_operand:VDQW 1 "s_register_operand")
>           (match_operand:VDQW 2 "s_register_operand")))]
>   "TARGET_NEON && (!<Is_float_mode> || flag_unsafe_math_optimizations)"
> {
>   arm_expand_vcond (operands, <V_cmp_result>mode);
>   DONE;
> })
>
> in sse.md it's defined as
> (define_expand "vcondu<V_512:mode><VI_AVX512BW:mode>"
>   [(set (match_operand:V_512 0 "register_operand")
>         (if_then_else:V_512
>           (match_operator 3 ""
>             [(match_operand:VI_AVX512BW 4 "nonimmediate_operand")
>              (match_operand:VI_AVX512BW 5 "nonimmediate_operand")])
>           (match_operand:V_512 1 "general_operand")
>           (match_operand:V_512 2 "general_operand")))]
>   "TARGET_AVX512F
>    && (GET_MODE_NUNITS (<V_512:MODE>mode)
>        == GET_MODE_NUNITS (<VI_AVX512BW:MODE>mode))"
> {
>   bool ok = ix86_expand_int_vcond (operands);
>   gcc_assert (ok);
>   DONE;
> })
>
> then we can get operands[1] and operands[2] as
>
> (gdb) p debug_rtx (operands[1])
>  (const_vector:V16QI [
>         (const_int -1 [0xffffffffffffffff]) repeated x16
>     ])
> (gdb) p debug_rtx (operands[2])
> (reg:V16QI 82 [ _2 ])
> (const_vector:V16QI [
>         (const_int 0 [0]) repeated x16
>     ])
Hi Hongtao,
Thanks for the suggestions!
However IIUC from vector extensions doc page, the result of vector
comparison is defined to be 0
or -1, so would it be better to canonicalize
x cmp y ? -1 : 0 to x cmp y, on GIMPLE itself during gimple-isel and
adjust targets if required ?
Alternatively, I could try fixing this in backend as you suggest above.

Thanks,
Prathamesh
>
> > > > > > > > > > >
> > > > > > > > > > > Alternatively, should we add a target hook that returns true if the
> > > > > > > > > > > result of vector comparison is set to all-ones or all-zeros, and then
> > > > > > > > > > > use this hook in gimple ISEL to effectively turn VEC_COND_EXPR into nop ?
> > > > > > > > > >
> > > > > > > > > > Would everything match-up for a .VEC_CMP IFN producing a non-mask
> > > > > > > > > > vector type?  ISEL could special case the a ? -1 : 0 case this way.
> > > > > > > > > I think the vec_cmp pattern matches but it produces a masked vector type.
> > > > > > > > > In the attached patch, I simply replaced:
> > > > > > > > > _1 = a < b
> > > > > > > > > x = _1 ? -1 : 0
> > > > > > > > > with
> > > > > > > > > x = view_convert_expr<_1>
> > > > > > > > >
> > > > > > > > > For the test-case, isel generates:
> > > > > > > > >   vector(8) <signed-boolean:8> _1;
> > > > > > > > >   vector(8) signed char _2;
> > > > > > > > >   uint8x8_t _5;
> > > > > > > > >
> > > > > > > > >   <bb 2> [local count: 1073741824]:
> > > > > > > > >   _1 = a_3(D) < b_4(D);
> > > > > > > > >   _2 = VIEW_CONVERT_EXPR<vector(8) signed char>(_1);
> > > > > > > > >   _5 = VIEW_CONVERT_EXPR<uint8x8_t>(_2);
> > > > > > > > >   return _5;
> > > > > > > > >
> > > > > > > > > and results in desired code-gen:
> > > > > > > > > f1:
> > > > > > > > >         vcgt.s8 d0, d1, d0
> > > > > > > > >         bx      lr
> > > > > > > > >
> > > > > > > > > Altho I guess, we should remove the redundant conversions during isel itself ?
> > > > > > > > > and result in:
> > > > > > > > > _1 = a_3(D) < b_4(D)
> > > > > > > > > _5 = VIEW_CONVERT_EXPR<uint8x8_t>(_1)
> > > > > > > > >
> > > > > > > > > (Patch is lightly tested with only vect.exp)
> > > > > > > >
> > > > > > > > +  /* For targets where result of comparison is all-ones or all-zeros,
> > > > > > > > +     a < b ? -1 : 0 can be reduced to a < b.  */
> > > > > > > > +
> > > > > > > > +  if (integer_minus_onep (op1) && integer_zerop (op2))
> > > > > > > > +    {
> > > > > > > >
> > > > > > > > So this really belongs here:
> > > > > > > >
> > > > > > > >           tree op0_type = TREE_TYPE (op0);
> > > > > > > >           tree op0a_type = TREE_TYPE (op0a);
> > > > > > > >
> > > > > > > > <---
> > > > > > > >
> > > > > > > >           if (used_vec_cond_exprs >= 2
> > > > > > > >               && (get_vcond_mask_icode (mode, TYPE_MODE (op0_type))
> > > > > > > >                   != CODE_FOR_nothing)
> > > > > > > >               && expand_vec_cmp_expr_p (op0a_type, op0_type, tcode))
> > > > > > > >             {
> > > > > > > >
> > > > > > > >
> > > > > > > > +      gassign *def_stmt = dyn_cast<gassign *> (SSA_NAME_DEF_STMT (op0));
> > > > > > > > +      tree op0a = gimple_assign_rhs1 (def_stmt);
> > > > > > > > +      tree op0_type = TREE_TYPE (op0);
> > > > > > > > +      tree op0a_type = TREE_TYPE (op0a);
> > > > > > > > +      enum tree_code tcode = gimple_assign_rhs_code (def_stmt);
> > > > > > > > +
> > > > > > > > +      if (expand_vec_cmp_expr_p (op0a_type, op0_type, tcode))
> > > > > > > > +       {
> > > > > > > > +         tree conv_op = build1 (VIEW_CONVERT_EXPR, TREE_TYPE (lhs), op0);
> > > > > > > > +         gassign *new_stmt = gimple_build_assign (lhs, conv_op);
> > > > > > > > +         gsi_replace (gsi, new_stmt, true);
> > > > > > > >
> > > > > > > > and you need to verify that the mode of the lhs and the mode of op0
> > > > > > > > agree and that the target can actually expand_vec_cmp_expr_p
> > > > > > > Thanks for the suggestions, does the attached patch look OK ?
> > > > > > > Sorry, I am not sure how to check if target can actually expand vec_cmp ?
> > > > > > > I assume that since expand_vec_cmp_expr_p queries optab and if it gets
> > > > > > > a valid cmp icode, that
> > > > > > > should be sufficient ?
> > > > > >
> > > > > > Yes
> > > > > Hi Richard,
> > > > > I tested the patch, and it shows one regression for pr78102.c, because
> > > > > of extra pcmpeqq in code-gen for x != y on x86.
> > > > > For the test-case:
> > > > > __v2di
> > > > > baz (const __v2di x, const __v2di y)
> > > > > {
> > > > >   return x != y;
> > > > > }
> > > > >
> > > > > Before patch:
> > > > > baz:
> > > > >         pcmpeqq %xmm1, %xmm0
> > > > >         pcmpeqd %xmm1, %xmm1
> > > > >         pandn   %xmm1, %xmm0
> > > > >         ret
> > > > >
> > > > > After patch,
> > > > > Before ISEL:
> > > > >   vector(2) <signed-boolean:64> _1;
> > > > >   __v2di _4;
> > > > >
> > > > >   <bb 2> [local count: 1073741824]:
> > > > >   _1 = x_2(D) != y_3(D);
> > > > >   _4 = VEC_COND_EXPR <_1, { -1, -1 }, { 0, 0 }>;
> > > > >   return _4;
> > > > >
> > > > > After ISEL:
> > > > >   vector(2) <signed-boolean:64> _1;
> > > > >   __v2di _4;
> > > > >
> > > > >   <bb 2> [local count: 1073741824]:
> > > > >   _1 = x_2(D) != y_3(D);
> > > > >   _4 = VIEW_CONVERT_EXPR<__v2di>(_1);
> > > > >   return _4;
> > > > >
> > > > > which results in:
> > > > >         pcmpeqq %xmm1, %xmm0
> > > > >         pxor    %xmm1, %xmm1
> > > > >         pcmpeqq %xmm1, %xmm0
> > > > >         ret
> > > > >
> > > > > IIUC, the new code-gen is essentially comparing two args for equality, and then
> > > > > comparing the result against zero to invert it, so it looks correct ?
> > > > > I am not sure which of the above two sequences is better tho ?
> > > > > If the new code-gen is OK, would it be OK to adjust the test-case ?
> > > >
> > > > In case pcmpeqq is double-issue the first variant might be faster while
> > > > the second variant has the advantage of the "free" pxor, but back-to-back
> > > > pcmpeqq might have an issue.
> > > >
> > > > I think on GIMPLE the new code is preferable and adjustments are
> > > > target business.  I wouldn't be surprised if the x86 backend
> > > > special-cases vcond to {-1,-1}, {0,0} already to arrive at the first
> > > > variant.
> > > >
> > > > Did you check how
> > > >
> > > > a = x != y ? { -1, -1 } : {0, 0 };
> > > > b = x != y ? { 1, 2 } : { 3, 4 };
> > > >
> > > > is handled before/after your patch?  That is, make the comparison
> > > > CSEd between two VEC_COND_EXPRs?
> > > For the test-case:
> > > __v2di f(__v2di, __v2di);
> > >
> > > __v2di
> > > baz (const __v2di x, const __v2di y)
> > > {
> > >   __v2di a = (x != y);
> > >   __v2di b = (x != y) ? (__v2di) {1, 2} : (__v2di) {3, 4};
> > >   return f (a, b);
> > > }
> > >
> > > Before patch, isel converts both to .vcondeq:
> > >   __v2di b;
> > >   __v2di a;
> > >   __v2di _8;
> > >
> > >   <bb 2> [local count: 1073741824]:
> > >   a_4 = .VCONDEQ (x_2(D), y_3(D), { -1, -1 }, { 0, 0 }, 114);
> > >   b_5 = .VCONDEQ (x_2(D), y_3(D), { 1, 2 }, { 3, 4 }, 114);
> > >   _8 = f (a_4, b_5); [tail call]
> > >   return _8;
> > >
> > > and results in following code-gen:
> > > _Z3bazDv2_xS_:
> > > .LFB5666:
> > >         pcmpeqq %xmm1, %xmm0
> > >         pcmpeqd %xmm1, %xmm1
> > >         movdqa  %xmm0, %xmm2
> > >         pandn   %xmm1, %xmm2
> > >         movdqa  .LC0(%rip), %xmm1
> > >         pblendvb        %xmm0, .LC1(%rip), %xmm1
> > >         movdqa  %xmm2, %xmm0
> > >         jmp     _Z1fDv2_xS_
> > >
> > > With patch, isel converts a = (x != y) ? {-1, -1} : {0, 0} to
> > > view_convert_expr and the other
> > > to vcondeq:
> > >   __v2di b;
> > >   __v2di a;
> > >   vector(2) <signed-boolean:64> _1;
> > >   __v2di _8;
> > >
> > >   <bb 2> [local count: 1073741824]:
> > >   _1 = x_2(D) != y_3(D);
> > >   a_4 = VIEW_CONVERT_EXPR<__v2di>(_1);
> > >   b_5 = .VCONDEQ (x_2(D), y_3(D), { 1, 2 }, { 3, 4 }, 114);
> > >   _8 = f (a_4, b_5); [tail call]
> > >   return _8;
> > >
> > > which results in following code-gen:
> > > _Z3bazDv2_xS_:
> > > .LFB5666:
> > >         pcmpeqq %xmm1, %xmm0
> > >         pxor    %xmm2, %xmm2
> > >         movdqa  .LC0(%rip), %xmm1
> > >         pblendvb        %xmm0, .LC1(%rip), %xmm1
> > >         pcmpeqq %xmm0, %xmm2
> > >         movdqa  %xmm2, %xmm0
> > >         jmp     _Z1fDv2_xS_
> >
> > Ok, thanks for checking.  I think the patch is OK but please let
> > Hongtao the chance to comment.
> >
> > Richard.
> >
> > > Thanks,
> > > Prathamesh
> > > >
> > > > Thanks,
> > > > Richard.
> > > >
> > > >
> > > > > Thanks,
> > > > > Prathamesh
> > > > > >
> > > > > > > Thanks,
> > > > > > > Prathamesh
> > > > > > > >
> > > > > > > > Richard.
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > > Thanks,
> > > > > > > > > Prathamesh
> > > > > > > > > >
> > > > > > > > > > > Thanks,
> > > > > > > > > > > Prathamesh
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > --
> > > > > > > > > > Richard Biener <rguenther@suse.de>
> > > > > > > > > > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> > > > > > > > > > Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
> > > > > > > > >
> > > > > > > >
> > > > > > > > --
> > > > > > > > Richard Biener <rguenther@suse.de>
> > > > > > > > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> > > > > > > > Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
> > > > > > >
> > > > > >
> > > > > > --
> > > > > > Richard Biener <rguenther@suse.de>
> > > > > > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> > > > > > Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
> > > > >
> > > >
> > > > --
> > > > Richard Biener <rguenther@suse.de>
> > > > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> > > > Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
> > >
> >
> > --
> > Richard Biener <rguenther@suse.de>
> > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> > Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
>
>
>
> --
> BR,
> Hongtao

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Help with PR97872
  2020-12-07 11:10                     ` Prathamesh Kulkarni
@ 2020-12-07 12:09                       ` Hongtao Liu
  2020-12-08  9:06                         ` Prathamesh Kulkarni
  0 siblings, 1 reply; 18+ messages in thread
From: Hongtao Liu @ 2020-12-07 12:09 UTC (permalink / raw)
  To: Prathamesh Kulkarni; +Cc: Richard Biener, GCC Development, Liu, Hongtao

On Mon, Dec 7, 2020 at 7:11 PM Prathamesh Kulkarni
<prathamesh.kulkarni@linaro.org> wrote:
>
> On Mon, 7 Dec 2020 at 16:15, Hongtao Liu <crazylht@gmail.com> wrote:
> >
> > On Mon, Dec 7, 2020 at 5:47 PM Richard Biener <rguenther@suse.de> wrote:
> > >
> > > On Mon, 7 Dec 2020, Prathamesh Kulkarni wrote:
> > >
> > > > On Mon, 7 Dec 2020 at 13:01, Richard Biener <rguenther@suse.de> wrote:
> > > > >
> > > > > On Mon, 7 Dec 2020, Prathamesh Kulkarni wrote:
> > > > >
> > > > > > On Fri, 4 Dec 2020 at 17:18, Richard Biener <rguenther@suse.de> wrote:
> > > > > > >
> > > > > > > On Fri, 4 Dec 2020, Prathamesh Kulkarni wrote:
> > > > > > >
> > > > > > > > On Thu, 3 Dec 2020 at 16:35, Richard Biener <rguenther@suse.de> wrote:
> > > > > > > > >
> > > > > > > > > On Thu, 3 Dec 2020, Prathamesh Kulkarni wrote:
> > > > > > > > >
> > > > > > > > > > On Tue, 1 Dec 2020 at 16:39, Richard Biener <rguenther@suse.de> wrote:
> > > > > > > > > > >
> > > > > > > > > > > On Tue, 1 Dec 2020, Prathamesh Kulkarni wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Hi,
> > > > > > > > > > > > For the test mentioned in PR, I was trying to see if we could do
> > > > > > > > > > > > specialized expansion for vcond in target when operands are -1 and 0.
> > > > > > > > > > > > arm_expand_vcond gets the following operands:
> > > > > > > > > > > > (reg:V8QI 113 [ _2 ])
> > > > > > > > > > > > (reg:V8QI 117)
> > > > > > > > > > > > (reg:V8QI 118)
> > > > > > > > > > > > (lt (reg/v:V8QI 115 [ a ])
> > > > > > > > > > > >     (reg/v:V8QI 116 [ b ]))
> > > > > > > > > > > > (reg/v:V8QI 115 [ a ])
> > > > > > > > > > > > (reg/v:V8QI 116 [ b ])
> > > > > > > > > > > >
> > > > > > > > > > > > where r117 and r118 are set to vector constants -1 and 0 respectively.
> > > > > > > > > > > > However, I am not sure if there's a way to check if the register is
> > > > > > > > > > > > constant during expansion time (since we don't have df analysis yet) ?
> >
> > It seems to me that all you need to do is relax the predicates of op1
> > and op2 in vcondmn to accept const0_rtx and constm1_rtx. I haven't
> > debugged it, but I see that vcondmn in neon.md only accepts
> > s_register_operand.
> >
> > (define_expand "vcond<mode><mode>"
> >   [(set (match_operand:VDQW 0 "s_register_operand")
> >         (if_then_else:VDQW
> >           (match_operator 3 "comparison_operator"
> >             [(match_operand:VDQW 4 "s_register_operand")
> >              (match_operand:VDQW 5 "reg_or_zero_operand")])
> >           (match_operand:VDQW 1 "s_register_operand")
> >           (match_operand:VDQW 2 "s_register_operand")))]
> >   "TARGET_NEON && (!<Is_float_mode> || flag_unsafe_math_optimizations)"
> > {
> >   arm_expand_vcond (operands, <V_cmp_result>mode);
> >   DONE;
> > })
> >
> > in sse.md it's defined as
> > (define_expand "vcondu<V_512:mode><VI_AVX512BW:mode>"
> >   [(set (match_operand:V_512 0 "register_operand")
> >         (if_then_else:V_512
> >           (match_operator 3 ""
> >             [(match_operand:VI_AVX512BW 4 "nonimmediate_operand")
> >              (match_operand:VI_AVX512BW 5 "nonimmediate_operand")])
> >           (match_operand:V_512 1 "general_operand")
> >           (match_operand:V_512 2 "general_operand")))]
> >   "TARGET_AVX512F
> >    && (GET_MODE_NUNITS (<V_512:MODE>mode)
> >        == GET_MODE_NUNITS (<VI_AVX512BW:MODE>mode))"
> > {
> >   bool ok = ix86_expand_int_vcond (operands);
> >   gcc_assert (ok);
> >   DONE;
> > })
> >
> > then we can get operands[1] and operands[2] as
> >
> > (gdb) p debug_rtx (operands[1])
> >  (const_vector:V16QI [
> >         (const_int -1 [0xffffffffffffffff]) repeated x16
> >     ])
> > (gdb) p debug_rtx (operands[2])
> > (reg:V16QI 82 [ _2 ])
> > (const_vector:V16QI [
> >         (const_int 0 [0]) repeated x16
> >     ])
> Hi Hongtao,
> Thanks for the suggestions!
> However IIUC from vector extensions doc page, the result of vector
> comparison is defined to be 0
> or -1, so would it be better to canonicalize
> x cmp y ? -1 : 0 to x cmp y, on GIMPLE itself during gimple-isel and
> adjust targets if required ?

Yes, it would be more straightforward to handle it in gimple isel, I
would adjust the backend and testcase after you check in the patch.

> Alternatively, I could try fixing this in backend as you suggest above.
>
> Thanks,
> Prathamesh
> >
> > > > > > > > > > > >
> > > > > > > > > > > > Alternatively, should we add a target hook that returns true if the
> > > > > > > > > > > > result of vector comparison is set to all-ones or all-zeros, and then
> > > > > > > > > > > > use this hook in gimple ISEL to effectively turn VEC_COND_EXPR into nop ?
> > > > > > > > > > >
> > > > > > > > > > > Would everything match-up for a .VEC_CMP IFN producing a non-mask
> > > > > > > > > > > vector type?  ISEL could special case the a ? -1 : 0 case this way.
> > > > > > > > > > I think the vec_cmp pattern matches but it produces a masked vector type.
> > > > > > > > > > In the attached patch, I simply replaced:
> > > > > > > > > > _1 = a < b
> > > > > > > > > > x = _1 ? -1 : 0
> > > > > > > > > > with
> > > > > > > > > > x = view_convert_expr<_1>
> > > > > > > > > >
> > > > > > > > > > For the test-case, isel generates:
> > > > > > > > > >   vector(8) <signed-boolean:8> _1;
> > > > > > > > > >   vector(8) signed char _2;
> > > > > > > > > >   uint8x8_t _5;
> > > > > > > > > >
> > > > > > > > > >   <bb 2> [local count: 1073741824]:
> > > > > > > > > >   _1 = a_3(D) < b_4(D);
> > > > > > > > > >   _2 = VIEW_CONVERT_EXPR<vector(8) signed char>(_1);
> > > > > > > > > >   _5 = VIEW_CONVERT_EXPR<uint8x8_t>(_2);
> > > > > > > > > >   return _5;
> > > > > > > > > >
> > > > > > > > > > and results in desired code-gen:
> > > > > > > > > > f1:
> > > > > > > > > >         vcgt.s8 d0, d1, d0
> > > > > > > > > >         bx      lr
> > > > > > > > > >
> > > > > > > > > > Altho I guess, we should remove the redundant conversions during isel itself ?
> > > > > > > > > > and result in:
> > > > > > > > > > _1 = a_3(D) < b_4(D)
> > > > > > > > > > _5 = VIEW_CONVERT_EXPR<uint8x8_t>(_1)
> > > > > > > > > >
> > > > > > > > > > (Patch is lightly tested with only vect.exp)
> > > > > > > > >
> > > > > > > > > +  /* For targets where result of comparison is all-ones or all-zeros,
> > > > > > > > > +     a < b ? -1 : 0 can be reduced to a < b.  */
> > > > > > > > > +
> > > > > > > > > +  if (integer_minus_onep (op1) && integer_zerop (op2))
> > > > > > > > > +    {
> > > > > > > > >
> > > > > > > > > So this really belongs here:
> > > > > > > > >
> > > > > > > > >           tree op0_type = TREE_TYPE (op0);
> > > > > > > > >           tree op0a_type = TREE_TYPE (op0a);
> > > > > > > > >
> > > > > > > > > <---
> > > > > > > > >
> > > > > > > > >           if (used_vec_cond_exprs >= 2
> > > > > > > > >               && (get_vcond_mask_icode (mode, TYPE_MODE (op0_type))
> > > > > > > > >                   != CODE_FOR_nothing)
> > > > > > > > >               && expand_vec_cmp_expr_p (op0a_type, op0_type, tcode))
> > > > > > > > >             {
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > +      gassign *def_stmt = dyn_cast<gassign *> (SSA_NAME_DEF_STMT (op0));
> > > > > > > > > +      tree op0a = gimple_assign_rhs1 (def_stmt);
> > > > > > > > > +      tree op0_type = TREE_TYPE (op0);
> > > > > > > > > +      tree op0a_type = TREE_TYPE (op0a);
> > > > > > > > > +      enum tree_code tcode = gimple_assign_rhs_code (def_stmt);
> > > > > > > > > +
> > > > > > > > > +      if (expand_vec_cmp_expr_p (op0a_type, op0_type, tcode))
> > > > > > > > > +       {
> > > > > > > > > +         tree conv_op = build1 (VIEW_CONVERT_EXPR, TREE_TYPE (lhs), op0);
> > > > > > > > > +         gassign *new_stmt = gimple_build_assign (lhs, conv_op);
> > > > > > > > > +         gsi_replace (gsi, new_stmt, true);
> > > > > > > > >
> > > > > > > > > and you need to verify that the mode of the lhs and the mode of op0
> > > > > > > > > agree and that the target can actually expand_vec_cmp_expr_p
> > > > > > > > Thanks for the suggestions, does the attached patch look OK ?
> > > > > > > > Sorry, I am not sure how to check if target can actually expand vec_cmp ?
> > > > > > > > I assume that since expand_vec_cmp_expr_p queries optab and if it gets
> > > > > > > > a valid cmp icode, that
> > > > > > > > should be sufficient ?
> > > > > > >
> > > > > > > Yes
> > > > > > Hi Richard,
> > > > > > I tested the patch, and it shows one regression for pr78102.c, because
> > > > > > of extra pcmpeqq in code-gen for x != y on x86.
> > > > > > For the test-case:
> > > > > > __v2di
> > > > > > baz (const __v2di x, const __v2di y)
> > > > > > {
> > > > > >   return x != y;
> > > > > > }
> > > > > >
> > > > > > Before patch:
> > > > > > baz:
> > > > > >         pcmpeqq %xmm1, %xmm0
> > > > > >         pcmpeqd %xmm1, %xmm1
> > > > > >         pandn   %xmm1, %xmm0
> > > > > >         ret
> > > > > >
> > > > > > After patch,
> > > > > > Before ISEL:
> > > > > >   vector(2) <signed-boolean:64> _1;
> > > > > >   __v2di _4;
> > > > > >
> > > > > >   <bb 2> [local count: 1073741824]:
> > > > > >   _1 = x_2(D) != y_3(D);
> > > > > >   _4 = VEC_COND_EXPR <_1, { -1, -1 }, { 0, 0 }>;
> > > > > >   return _4;
> > > > > >
> > > > > > After ISEL:
> > > > > >   vector(2) <signed-boolean:64> _1;
> > > > > >   __v2di _4;
> > > > > >
> > > > > >   <bb 2> [local count: 1073741824]:
> > > > > >   _1 = x_2(D) != y_3(D);
> > > > > >   _4 = VIEW_CONVERT_EXPR<__v2di>(_1);
> > > > > >   return _4;
> > > > > >
> > > > > > which results in:
> > > > > >         pcmpeqq %xmm1, %xmm0
> > > > > >         pxor    %xmm1, %xmm1
> > > > > >         pcmpeqq %xmm1, %xmm0
> > > > > >         ret
seems better to be

         pcmpeqq %xmm1, %xmm0
         pxor    %xmm1, %xmm1
         pxor %xmm1, %xmm0
         ret

Anyway, it needs backend adjustment.

> > > > > > IIUC, the new code-gen is essentially comparing two args for equality, and then
> > > > > > comparing the result against zero to invert it, so it looks correct ?
> > > > > > I am not sure which of the above two sequences is better tho ?
> > > > > > If the new code-gen is OK, would it be OK to adjust the test-case ?
> > > > >
> > > > > In case pcmpeqq is double-issue the first variant might be faster while
> > > > > the second variant has the advantage of the "free" pxor, but back-to-back
> > > > > pcmpeqq might have an issue.
> > > > >
> > > > > I think on GIMPLE the new code is preferable and adjustments are
> > > > > target business.  I wouldn't be surprised if the x86 backend
> > > > > special-cases vcond to {-1,-1}, {0,0} already to arrive at the first
> > > > > variant.
> > > > >
> > > > > Did you check how
> > > > >
> > > > > a = x != y ? { -1, -1 } : {0, 0 };
> > > > > b = x != y ? { 1, 2 } : { 3, 4 };
> > > > >
> > > > > is handled before/after your patch?  That is, make the comparison
> > > > > CSEd between two VEC_COND_EXPRs?
> > > > For the test-case:
> > > > __v2di f(__v2di, __v2di);
> > > >
> > > > __v2di
> > > > baz (const __v2di x, const __v2di y)
> > > > {
> > > >   __v2di a = (x != y);
> > > >   __v2di b = (x != y) ? (__v2di) {1, 2} : (__v2di) {3, 4};
> > > >   return f (a, b);
> > > > }
> > > >
> > > > Before patch, isel converts both to .vcondeq:
> > > >   __v2di b;
> > > >   __v2di a;
> > > >   __v2di _8;
> > > >
> > > >   <bb 2> [local count: 1073741824]:
> > > >   a_4 = .VCONDEQ (x_2(D), y_3(D), { -1, -1 }, { 0, 0 }, 114);
> > > >   b_5 = .VCONDEQ (x_2(D), y_3(D), { 1, 2 }, { 3, 4 }, 114);
> > > >   _8 = f (a_4, b_5); [tail call]
> > > >   return _8;
> > > >
> > > > and results in following code-gen:
> > > > _Z3bazDv2_xS_:
> > > > .LFB5666:
> > > >         pcmpeqq %xmm1, %xmm0
> > > >         pcmpeqd %xmm1, %xmm1
> > > >         movdqa  %xmm0, %xmm2
> > > >         pandn   %xmm1, %xmm2
> > > >         movdqa  .LC0(%rip), %xmm1
> > > >         pblendvb        %xmm0, .LC1(%rip), %xmm1
> > > >         movdqa  %xmm2, %xmm0
> > > >         jmp     _Z1fDv2_xS_
> > > >
> > > > With patch, isel converts a = (x != y) ? {-1, -1} : {0, 0} to
> > > > view_convert_expr and the other
> > > > to vcondeq:
> > > >   __v2di b;
> > > >   __v2di a;
> > > >   vector(2) <signed-boolean:64> _1;
> > > >   __v2di _8;
> > > >
> > > >   <bb 2> [local count: 1073741824]:
> > > >   _1 = x_2(D) != y_3(D);
> > > >   a_4 = VIEW_CONVERT_EXPR<__v2di>(_1);
> > > >   b_5 = .VCONDEQ (x_2(D), y_3(D), { 1, 2 }, { 3, 4 }, 114);
> > > >   _8 = f (a_4, b_5); [tail call]
> > > >   return _8;
> > > >
> > > > which results in following code-gen:
> > > > _Z3bazDv2_xS_:
> > > > .LFB5666:
> > > >         pcmpeqq %xmm1, %xmm0
> > > >         pxor    %xmm2, %xmm2
> > > >         movdqa  .LC0(%rip), %xmm1
> > > >         pblendvb        %xmm0, .LC1(%rip), %xmm1
> > > >         pcmpeqq %xmm0, %xmm2
> > > >         movdqa  %xmm2, %xmm0
> > > >         jmp     _Z1fDv2_xS_
> > >
> > > Ok, thanks for checking.  I think the patch is OK but please let
> > > Hongtao the chance to comment.
> > >
> > > Richard.
> > >
> > > > Thanks,
> > > > Prathamesh
> > > > >
> > > > > Thanks,
> > > > > Richard.
> > > > >
> > > > >
> > > > > > Thanks,
> > > > > > Prathamesh
> > > > > > >
> > > > > > > > Thanks,
> > > > > > > > Prathamesh
> > > > > > > > >
> > > > > > > > > Richard.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > > Thanks,
> > > > > > > > > > Prathamesh
> > > > > > > > > > >
> > > > > > > > > > > > Thanks,
> > > > > > > > > > > > Prathamesh
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > --
> > > > > > > > > > > Richard Biener <rguenther@suse.de>
> > > > > > > > > > > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> > > > > > > > > > > Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > > Richard Biener <rguenther@suse.de>
> > > > > > > > > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> > > > > > > > > Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
> > > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > > Richard Biener <rguenther@suse.de>
> > > > > > > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> > > > > > > Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
> > > > > >
> > > > >
> > > > > --
> > > > > Richard Biener <rguenther@suse.de>
> > > > > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> > > > > Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
> > > >
> > >
> > > --
> > > Richard Biener <rguenther@suse.de>
> > > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> > > Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
> >
> >
> >
> > --
> > BR,
> > Hongtao



-- 
BR,
Hongtao

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Help with PR97872
  2020-12-07 12:09                       ` Hongtao Liu
@ 2020-12-08  9:06                         ` Prathamesh Kulkarni
  2020-12-09 11:47                           ` Prathamesh Kulkarni
  0 siblings, 1 reply; 18+ messages in thread
From: Prathamesh Kulkarni @ 2020-12-08  9:06 UTC (permalink / raw)
  To: Hongtao Liu; +Cc: Richard Biener, GCC Development, Liu, Hongtao, gcc Patches

[-- Attachment #1: Type: text/plain, Size: 16081 bytes --]

On Mon, 7 Dec 2020 at 17:37, Hongtao Liu <crazylht@gmail.com> wrote:
>
> On Mon, Dec 7, 2020 at 7:11 PM Prathamesh Kulkarni
> <prathamesh.kulkarni@linaro.org> wrote:
> >
> > On Mon, 7 Dec 2020 at 16:15, Hongtao Liu <crazylht@gmail.com> wrote:
> > >
> > > On Mon, Dec 7, 2020 at 5:47 PM Richard Biener <rguenther@suse.de> wrote:
> > > >
> > > > On Mon, 7 Dec 2020, Prathamesh Kulkarni wrote:
> > > >
> > > > > On Mon, 7 Dec 2020 at 13:01, Richard Biener <rguenther@suse.de> wrote:
> > > > > >
> > > > > > On Mon, 7 Dec 2020, Prathamesh Kulkarni wrote:
> > > > > >
> > > > > > > On Fri, 4 Dec 2020 at 17:18, Richard Biener <rguenther@suse.de> wrote:
> > > > > > > >
> > > > > > > > On Fri, 4 Dec 2020, Prathamesh Kulkarni wrote:
> > > > > > > >
> > > > > > > > > On Thu, 3 Dec 2020 at 16:35, Richard Biener <rguenther@suse.de> wrote:
> > > > > > > > > >
> > > > > > > > > > On Thu, 3 Dec 2020, Prathamesh Kulkarni wrote:
> > > > > > > > > >
> > > > > > > > > > > On Tue, 1 Dec 2020 at 16:39, Richard Biener <rguenther@suse.de> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > On Tue, 1 Dec 2020, Prathamesh Kulkarni wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > Hi,
> > > > > > > > > > > > > For the test mentioned in PR, I was trying to see if we could do
> > > > > > > > > > > > > specialized expansion for vcond in target when operands are -1 and 0.
> > > > > > > > > > > > > arm_expand_vcond gets the following operands:
> > > > > > > > > > > > > (reg:V8QI 113 [ _2 ])
> > > > > > > > > > > > > (reg:V8QI 117)
> > > > > > > > > > > > > (reg:V8QI 118)
> > > > > > > > > > > > > (lt (reg/v:V8QI 115 [ a ])
> > > > > > > > > > > > >     (reg/v:V8QI 116 [ b ]))
> > > > > > > > > > > > > (reg/v:V8QI 115 [ a ])
> > > > > > > > > > > > > (reg/v:V8QI 116 [ b ])
> > > > > > > > > > > > >
> > > > > > > > > > > > > where r117 and r118 are set to vector constants -1 and 0 respectively.
> > > > > > > > > > > > > However, I am not sure if there's a way to check if the register is
> > > > > > > > > > > > > constant during expansion time (since we don't have df analysis yet) ?
> > >
> > > It seems to me that all you need to do is relax the predicates of op1
> > > and op2 in vcondmn to accept const0_rtx and constm1_rtx. I haven't
> > > debugged it, but I see that vcondmn in neon.md only accepts
> > > s_register_operand.
> > >
> > > (define_expand "vcond<mode><mode>"
> > >   [(set (match_operand:VDQW 0 "s_register_operand")
> > >         (if_then_else:VDQW
> > >           (match_operator 3 "comparison_operator"
> > >             [(match_operand:VDQW 4 "s_register_operand")
> > >              (match_operand:VDQW 5 "reg_or_zero_operand")])
> > >           (match_operand:VDQW 1 "s_register_operand")
> > >           (match_operand:VDQW 2 "s_register_operand")))]
> > >   "TARGET_NEON && (!<Is_float_mode> || flag_unsafe_math_optimizations)"
> > > {
> > >   arm_expand_vcond (operands, <V_cmp_result>mode);
> > >   DONE;
> > > })
> > >
> > > in sse.md it's defined as
> > > (define_expand "vcondu<V_512:mode><VI_AVX512BW:mode>"
> > >   [(set (match_operand:V_512 0 "register_operand")
> > >         (if_then_else:V_512
> > >           (match_operator 3 ""
> > >             [(match_operand:VI_AVX512BW 4 "nonimmediate_operand")
> > >              (match_operand:VI_AVX512BW 5 "nonimmediate_operand")])
> > >           (match_operand:V_512 1 "general_operand")
> > >           (match_operand:V_512 2 "general_operand")))]
> > >   "TARGET_AVX512F
> > >    && (GET_MODE_NUNITS (<V_512:MODE>mode)
> > >        == GET_MODE_NUNITS (<VI_AVX512BW:MODE>mode))"
> > > {
> > >   bool ok = ix86_expand_int_vcond (operands);
> > >   gcc_assert (ok);
> > >   DONE;
> > > })
> > >
> > > then we can get operands[1] and operands[2] as
> > >
> > > (gdb) p debug_rtx (operands[1])
> > >  (const_vector:V16QI [
> > >         (const_int -1 [0xffffffffffffffff]) repeated x16
> > >     ])
> > > (gdb) p debug_rtx (operands[2])
> > > (reg:V16QI 82 [ _2 ])
> > > (const_vector:V16QI [
> > >         (const_int 0 [0]) repeated x16
> > >     ])
> > Hi Hongtao,
> > Thanks for the suggestions!
> > However IIUC from vector extensions doc page, the result of vector
> > comparison is defined to be 0
> > or -1, so would it be better to canonicalize
> > x cmp y ? -1 : 0 to x cmp y, on GIMPLE itself during gimple-isel and
> > adjust targets if required ?
>
> Yes, it would be more straightforward to handle it in gimple isel, I
> would adjust the backend and testcase after you check in the patch.
Thanks! I have committed the attached patch in
3a6e3ad38a17a03ee0139b49a0946e7b9ded1eb1.

Regards,
Prathamesh
>
> > Alternatively, I could try fixing this in backend as you suggest above.
> >
> > Thanks,
> > Prathamesh
> > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > Alternatively, should we add a target hook that returns true if the
> > > > > > > > > > > > > result of vector comparison is set to all-ones or all-zeros, and then
> > > > > > > > > > > > > use this hook in gimple ISEL to effectively turn VEC_COND_EXPR into nop ?
> > > > > > > > > > > >
> > > > > > > > > > > > Would everything match-up for a .VEC_CMP IFN producing a non-mask
> > > > > > > > > > > > vector type?  ISEL could special case the a ? -1 : 0 case this way.
> > > > > > > > > > > I think the vec_cmp pattern matches but it produces a masked vector type.
> > > > > > > > > > > In the attached patch, I simply replaced:
> > > > > > > > > > > _1 = a < b
> > > > > > > > > > > x = _1 ? -1 : 0
> > > > > > > > > > > with
> > > > > > > > > > > x = view_convert_expr<_1>
> > > > > > > > > > >
> > > > > > > > > > > For the test-case, isel generates:
> > > > > > > > > > >   vector(8) <signed-boolean:8> _1;
> > > > > > > > > > >   vector(8) signed char _2;
> > > > > > > > > > >   uint8x8_t _5;
> > > > > > > > > > >
> > > > > > > > > > >   <bb 2> [local count: 1073741824]:
> > > > > > > > > > >   _1 = a_3(D) < b_4(D);
> > > > > > > > > > >   _2 = VIEW_CONVERT_EXPR<vector(8) signed char>(_1);
> > > > > > > > > > >   _5 = VIEW_CONVERT_EXPR<uint8x8_t>(_2);
> > > > > > > > > > >   return _5;
> > > > > > > > > > >
> > > > > > > > > > > and results in desired code-gen:
> > > > > > > > > > > f1:
> > > > > > > > > > >         vcgt.s8 d0, d1, d0
> > > > > > > > > > >         bx      lr
> > > > > > > > > > >
> > > > > > > > > > > Altho I guess, we should remove the redundant conversions during isel itself ?
> > > > > > > > > > > and result in:
> > > > > > > > > > > _1 = a_3(D) < b_4(D)
> > > > > > > > > > > _5 = VIEW_CONVERT_EXPR<uint8x8_t>(_1)
> > > > > > > > > > >
> > > > > > > > > > > (Patch is lightly tested with only vect.exp)
> > > > > > > > > >
> > > > > > > > > > +  /* For targets where result of comparison is all-ones or all-zeros,
> > > > > > > > > > +     a < b ? -1 : 0 can be reduced to a < b.  */
> > > > > > > > > > +
> > > > > > > > > > +  if (integer_minus_onep (op1) && integer_zerop (op2))
> > > > > > > > > > +    {
> > > > > > > > > >
> > > > > > > > > > So this really belongs here:
> > > > > > > > > >
> > > > > > > > > >           tree op0_type = TREE_TYPE (op0);
> > > > > > > > > >           tree op0a_type = TREE_TYPE (op0a);
> > > > > > > > > >
> > > > > > > > > > <---
> > > > > > > > > >
> > > > > > > > > >           if (used_vec_cond_exprs >= 2
> > > > > > > > > >               && (get_vcond_mask_icode (mode, TYPE_MODE (op0_type))
> > > > > > > > > >                   != CODE_FOR_nothing)
> > > > > > > > > >               && expand_vec_cmp_expr_p (op0a_type, op0_type, tcode))
> > > > > > > > > >             {
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > +      gassign *def_stmt = dyn_cast<gassign *> (SSA_NAME_DEF_STMT (op0));
> > > > > > > > > > +      tree op0a = gimple_assign_rhs1 (def_stmt);
> > > > > > > > > > +      tree op0_type = TREE_TYPE (op0);
> > > > > > > > > > +      tree op0a_type = TREE_TYPE (op0a);
> > > > > > > > > > +      enum tree_code tcode = gimple_assign_rhs_code (def_stmt);
> > > > > > > > > > +
> > > > > > > > > > +      if (expand_vec_cmp_expr_p (op0a_type, op0_type, tcode))
> > > > > > > > > > +       {
> > > > > > > > > > +         tree conv_op = build1 (VIEW_CONVERT_EXPR, TREE_TYPE (lhs), op0);
> > > > > > > > > > +         gassign *new_stmt = gimple_build_assign (lhs, conv_op);
> > > > > > > > > > +         gsi_replace (gsi, new_stmt, true);
> > > > > > > > > >
> > > > > > > > > > and you need to verify that the mode of the lhs and the mode of op0
> > > > > > > > > > agree and that the target can actually expand_vec_cmp_expr_p
> > > > > > > > > Thanks for the suggestions, does the attached patch look OK ?
> > > > > > > > > Sorry, I am not sure how to check if target can actually expand vec_cmp ?
> > > > > > > > > I assume that since expand_vec_cmp_expr_p queries optab and if it gets
> > > > > > > > > a valid cmp icode, that
> > > > > > > > > should be sufficient ?
> > > > > > > >
> > > > > > > > Yes
> > > > > > > Hi Richard,
> > > > > > > I tested the patch, and it shows one regression for pr78102.c, because
> > > > > > > of extra pcmpeqq in code-gen for x != y on x86.
> > > > > > > For the test-case:
> > > > > > > __v2di
> > > > > > > baz (const __v2di x, const __v2di y)
> > > > > > > {
> > > > > > >   return x != y;
> > > > > > > }
> > > > > > >
> > > > > > > Before patch:
> > > > > > > baz:
> > > > > > >         pcmpeqq %xmm1, %xmm0
> > > > > > >         pcmpeqd %xmm1, %xmm1
> > > > > > >         pandn   %xmm1, %xmm0
> > > > > > >         ret
> > > > > > >
> > > > > > > After patch,
> > > > > > > Before ISEL:
> > > > > > >   vector(2) <signed-boolean:64> _1;
> > > > > > >   __v2di _4;
> > > > > > >
> > > > > > >   <bb 2> [local count: 1073741824]:
> > > > > > >   _1 = x_2(D) != y_3(D);
> > > > > > >   _4 = VEC_COND_EXPR <_1, { -1, -1 }, { 0, 0 }>;
> > > > > > >   return _4;
> > > > > > >
> > > > > > > After ISEL:
> > > > > > >   vector(2) <signed-boolean:64> _1;
> > > > > > >   __v2di _4;
> > > > > > >
> > > > > > >   <bb 2> [local count: 1073741824]:
> > > > > > >   _1 = x_2(D) != y_3(D);
> > > > > > >   _4 = VIEW_CONVERT_EXPR<__v2di>(_1);
> > > > > > >   return _4;
> > > > > > >
> > > > > > > which results in:
> > > > > > >         pcmpeqq %xmm1, %xmm0
> > > > > > >         pxor    %xmm1, %xmm1
> > > > > > >         pcmpeqq %xmm1, %xmm0
> > > > > > >         ret
> seems better to be
>
>          pcmpeqq %xmm1, %xmm0
>          pxor    %xmm1, %xmm1
>          pxor %xmm1, %xmm0
>          ret
>
> Anyway, it needs backend adjustment.
>
> > > > > > > IIUC, the new code-gen is essentially comparing two args for equality, and then
> > > > > > > comparing the result against zero to invert it, so it looks correct ?
> > > > > > > I am not sure which of the above two sequences is better tho ?
> > > > > > > If the new code-gen is OK, would it be OK to adjust the test-case ?
> > > > > >
> > > > > > In case pcmpeqq is double-issue the first variant might be faster while
> > > > > > the second variant has the advantage of the "free" pxor, but back-to-back
> > > > > > pcmpeqq might have an issue.
> > > > > >
> > > > > > I think on GIMPLE the new code is preferable and adjustments are
> > > > > > target business.  I wouldn't be surprised if the x86 backend
> > > > > > special-cases vcond to {-1,-1}, {0,0} already to arrive at the first
> > > > > > variant.
> > > > > >
> > > > > > Did you check how
> > > > > >
> > > > > > a = x != y ? { -1, -1 } : {0, 0 };
> > > > > > b = x != y ? { 1, 2 } : { 3, 4 };
> > > > > >
> > > > > > is handled before/after your patch?  That is, make the comparison
> > > > > > CSEd between two VEC_COND_EXPRs?
> > > > > For the test-case:
> > > > > __v2di f(__v2di, __v2di);
> > > > >
> > > > > __v2di
> > > > > baz (const __v2di x, const __v2di y)
> > > > > {
> > > > >   __v2di a = (x != y);
> > > > >   __v2di b = (x != y) ? (__v2di) {1, 2} : (__v2di) {3, 4};
> > > > >   return f (a, b);
> > > > > }
> > > > >
> > > > > Before patch, isel converts both to .vcondeq:
> > > > >   __v2di b;
> > > > >   __v2di a;
> > > > >   __v2di _8;
> > > > >
> > > > >   <bb 2> [local count: 1073741824]:
> > > > >   a_4 = .VCONDEQ (x_2(D), y_3(D), { -1, -1 }, { 0, 0 }, 114);
> > > > >   b_5 = .VCONDEQ (x_2(D), y_3(D), { 1, 2 }, { 3, 4 }, 114);
> > > > >   _8 = f (a_4, b_5); [tail call]
> > > > >   return _8;
> > > > >
> > > > > and results in following code-gen:
> > > > > _Z3bazDv2_xS_:
> > > > > .LFB5666:
> > > > >         pcmpeqq %xmm1, %xmm0
> > > > >         pcmpeqd %xmm1, %xmm1
> > > > >         movdqa  %xmm0, %xmm2
> > > > >         pandn   %xmm1, %xmm2
> > > > >         movdqa  .LC0(%rip), %xmm1
> > > > >         pblendvb        %xmm0, .LC1(%rip), %xmm1
> > > > >         movdqa  %xmm2, %xmm0
> > > > >         jmp     _Z1fDv2_xS_
> > > > >
> > > > > With patch, isel converts a = (x != y) ? {-1, -1} : {0, 0} to
> > > > > view_convert_expr and the other
> > > > > to vcondeq:
> > > > >   __v2di b;
> > > > >   __v2di a;
> > > > >   vector(2) <signed-boolean:64> _1;
> > > > >   __v2di _8;
> > > > >
> > > > >   <bb 2> [local count: 1073741824]:
> > > > >   _1 = x_2(D) != y_3(D);
> > > > >   a_4 = VIEW_CONVERT_EXPR<__v2di>(_1);
> > > > >   b_5 = .VCONDEQ (x_2(D), y_3(D), { 1, 2 }, { 3, 4 }, 114);
> > > > >   _8 = f (a_4, b_5); [tail call]
> > > > >   return _8;
> > > > >
> > > > > which results in following code-gen:
> > > > > _Z3bazDv2_xS_:
> > > > > .LFB5666:
> > > > >         pcmpeqq %xmm1, %xmm0
> > > > >         pxor    %xmm2, %xmm2
> > > > >         movdqa  .LC0(%rip), %xmm1
> > > > >         pblendvb        %xmm0, .LC1(%rip), %xmm1
> > > > >         pcmpeqq %xmm0, %xmm2
> > > > >         movdqa  %xmm2, %xmm0
> > > > >         jmp     _Z1fDv2_xS_
> > > >
> > > > Ok, thanks for checking.  I think the patch is OK but please let
> > > > Hongtao the chance to comment.
> > > >
> > > > Richard.
> > > >
> > > > > Thanks,
> > > > > Prathamesh
> > > > > >
> > > > > > Thanks,
> > > > > > Richard.
> > > > > >
> > > > > >
> > > > > > > Thanks,
> > > > > > > Prathamesh
> > > > > > > >
> > > > > > > > > Thanks,
> > > > > > > > > Prathamesh
> > > > > > > > > >
> > > > > > > > > > Richard.
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > > Thanks,
> > > > > > > > > > > Prathamesh
> > > > > > > > > > > >
> > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > Prathamesh
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > --
> > > > > > > > > > > > Richard Biener <rguenther@suse.de>
> > > > > > > > > > > > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> > > > > > > > > > > > Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > --
> > > > > > > > > > Richard Biener <rguenther@suse.de>
> > > > > > > > > > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> > > > > > > > > > Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
> > > > > > > > >
> > > > > > > >
> > > > > > > > --
> > > > > > > > Richard Biener <rguenther@suse.de>
> > > > > > > > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> > > > > > > > Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
> > > > > > >
> > > > > >
> > > > > > --
> > > > > > Richard Biener <rguenther@suse.de>
> > > > > > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> > > > > > Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
> > > > >
> > > >
> > > > --
> > > > Richard Biener <rguenther@suse.de>
> > > > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> > > > Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
> > >
> > >
> > >
> > > --
> > > BR,
> > > Hongtao
>
>
>
> --
> BR,
> Hongtao

[-- Attachment #2: pr97872-3.diff --]
[-- Type: application/octet-stream, Size: 1392 bytes --]

diff --git a/gcc/gimple-isel.cc b/gcc/gimple-isel.cc
index d79c212748f..1ab75e72c2e 100644
--- a/gcc/gimple-isel.cc
+++ b/gcc/gimple-isel.cc
@@ -184,6 +184,20 @@ gimple_expand_vec_cond_expr (gimple_stmt_iterator *gsi,
 
 	  tree op0_type = TREE_TYPE (op0);
 	  tree op0a_type = TREE_TYPE (op0a);
+
+	  /* Try to fold x CMP y ? -1 : 0 to x CMP y.  */
+
+	  if (integer_minus_onep (op1)
+	      && integer_zerop (op2)
+	      && TYPE_MODE (TREE_TYPE (lhs)) == TYPE_MODE (TREE_TYPE (op0))
+	      && expand_vec_cmp_expr_p (op0a_type, op0_type, tcode))
+	    {
+	      tree conv_op = build1 (VIEW_CONVERT_EXPR, TREE_TYPE (lhs), op0);
+	      gassign *new_stmt = gimple_build_assign (lhs, conv_op);
+	      gsi_replace (gsi, new_stmt, true);
+	      return new_stmt;
+	    }
+
 	  if (used_vec_cond_exprs >= 2
 	      && (get_vcond_mask_icode (mode, TYPE_MODE (op0_type))
 		  != CODE_FOR_nothing)
diff --git a/gcc/testsuite/gcc.target/arm/pr97872.c b/gcc/testsuite/gcc.target/arm/pr97872.c
new file mode 100644
index 00000000000..eeb4dd9d6bc
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/pr97872.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-O3" } */
+/* { dg-add-options arm_neon } */
+
+#include <arm_neon.h>
+
+uint8x8_t f1(int8x8_t a, int8x8_t b) {
+  return a < b;
+}
+
+/* { dg-final { scan-assembler-not "vbsl" } } */

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Help with PR97872
  2020-12-08  9:06                         ` Prathamesh Kulkarni
@ 2020-12-09 11:47                           ` Prathamesh Kulkarni
  2020-12-10  1:30                             ` Hongtao Liu
  2020-12-10  7:15                             ` Richard Biener
  0 siblings, 2 replies; 18+ messages in thread
From: Prathamesh Kulkarni @ 2020-12-09 11:47 UTC (permalink / raw)
  To: Hongtao Liu; +Cc: Richard Biener, GCC Development, Liu, Hongtao, gcc Patches

On Tue, 8 Dec 2020 at 14:36, Prathamesh Kulkarni
<prathamesh.kulkarni@linaro.org> wrote:
>
> On Mon, 7 Dec 2020 at 17:37, Hongtao Liu <crazylht@gmail.com> wrote:
> >
> > On Mon, Dec 7, 2020 at 7:11 PM Prathamesh Kulkarni
> > <prathamesh.kulkarni@linaro.org> wrote:
> > >
> > > On Mon, 7 Dec 2020 at 16:15, Hongtao Liu <crazylht@gmail.com> wrote:
> > > >
> > > > On Mon, Dec 7, 2020 at 5:47 PM Richard Biener <rguenther@suse.de> wrote:
> > > > >
> > > > > On Mon, 7 Dec 2020, Prathamesh Kulkarni wrote:
> > > > >
> > > > > > On Mon, 7 Dec 2020 at 13:01, Richard Biener <rguenther@suse.de> wrote:
> > > > > > >
> > > > > > > On Mon, 7 Dec 2020, Prathamesh Kulkarni wrote:
> > > > > > >
> > > > > > > > On Fri, 4 Dec 2020 at 17:18, Richard Biener <rguenther@suse.de> wrote:
> > > > > > > > >
> > > > > > > > > On Fri, 4 Dec 2020, Prathamesh Kulkarni wrote:
> > > > > > > > >
> > > > > > > > > > On Thu, 3 Dec 2020 at 16:35, Richard Biener <rguenther@suse.de> wrote:
> > > > > > > > > > >
> > > > > > > > > > > On Thu, 3 Dec 2020, Prathamesh Kulkarni wrote:
> > > > > > > > > > >
> > > > > > > > > > > > On Tue, 1 Dec 2020 at 16:39, Richard Biener <rguenther@suse.de> wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Tue, 1 Dec 2020, Prathamesh Kulkarni wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > Hi,
> > > > > > > > > > > > > > For the test mentioned in PR, I was trying to see if we could do
> > > > > > > > > > > > > > specialized expansion for vcond in target when operands are -1 and 0.
> > > > > > > > > > > > > > arm_expand_vcond gets the following operands:
> > > > > > > > > > > > > > (reg:V8QI 113 [ _2 ])
> > > > > > > > > > > > > > (reg:V8QI 117)
> > > > > > > > > > > > > > (reg:V8QI 118)
> > > > > > > > > > > > > > (lt (reg/v:V8QI 115 [ a ])
> > > > > > > > > > > > > >     (reg/v:V8QI 116 [ b ]))
> > > > > > > > > > > > > > (reg/v:V8QI 115 [ a ])
> > > > > > > > > > > > > > (reg/v:V8QI 116 [ b ])
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > where r117 and r118 are set to vector constants -1 and 0 respectively.
> > > > > > > > > > > > > > However, I am not sure if there's a way to check if the register is
> > > > > > > > > > > > > > constant during expansion time (since we don't have df analysis yet) ?
> > > >
> > > > It seems to me that all you need to do is relax the predicates of op1
> > > > and op2 in vcondmn to accept const0_rtx and constm1_rtx. I haven't
> > > > debugged it, but I see that vcondmn in neon.md only accepts
> > > > s_register_operand.
> > > >
> > > > (define_expand "vcond<mode><mode>"
> > > >   [(set (match_operand:VDQW 0 "s_register_operand")
> > > >         (if_then_else:VDQW
> > > >           (match_operator 3 "comparison_operator"
> > > >             [(match_operand:VDQW 4 "s_register_operand")
> > > >              (match_operand:VDQW 5 "reg_or_zero_operand")])
> > > >           (match_operand:VDQW 1 "s_register_operand")
> > > >           (match_operand:VDQW 2 "s_register_operand")))]
> > > >   "TARGET_NEON && (!<Is_float_mode> || flag_unsafe_math_optimizations)"
> > > > {
> > > >   arm_expand_vcond (operands, <V_cmp_result>mode);
> > > >   DONE;
> > > > })
> > > >
> > > > in sse.md it's defined as
> > > > (define_expand "vcondu<V_512:mode><VI_AVX512BW:mode>"
> > > >   [(set (match_operand:V_512 0 "register_operand")
> > > >         (if_then_else:V_512
> > > >           (match_operator 3 ""
> > > >             [(match_operand:VI_AVX512BW 4 "nonimmediate_operand")
> > > >              (match_operand:VI_AVX512BW 5 "nonimmediate_operand")])
> > > >           (match_operand:V_512 1 "general_operand")
> > > >           (match_operand:V_512 2 "general_operand")))]
> > > >   "TARGET_AVX512F
> > > >    && (GET_MODE_NUNITS (<V_512:MODE>mode)
> > > >        == GET_MODE_NUNITS (<VI_AVX512BW:MODE>mode))"
> > > > {
> > > >   bool ok = ix86_expand_int_vcond (operands);
> > > >   gcc_assert (ok);
> > > >   DONE;
> > > > })
> > > >
> > > > then we can get operands[1] and operands[2] as
> > > >
> > > > (gdb) p debug_rtx (operands[1])
> > > >  (const_vector:V16QI [
> > > >         (const_int -1 [0xffffffffffffffff]) repeated x16
> > > >     ])
> > > > (gdb) p debug_rtx (operands[2])
> > > > (reg:V16QI 82 [ _2 ])
> > > > (const_vector:V16QI [
> > > >         (const_int 0 [0]) repeated x16
> > > >     ])
> > > Hi Hongtao,
> > > Thanks for the suggestions!
> > > However IIUC from vector extensions doc page, the result of vector
> > > comparison is defined to be 0
> > > or -1, so would it be better to canonicalize
> > > x cmp y ? -1 : 0 to x cmp y, on GIMPLE itself during gimple-isel and
> > > adjust targets if required ?
> >
> > Yes, it would be more straightforward to handle it in gimple isel, I
> > would adjust the backend and testcase after you check in the patch.
> Thanks! I have committed the attached patch in
> 3a6e3ad38a17a03ee0139b49a0946e7b9ded1eb1.
Hi,
I was looking at similar issue in PR97903 and wondering what will be the
right approach to lower (a & b) != 0 to vtst ?
For test-case:
#include <arm_neon.h>

int8x8_t f1(int8x8_t a, int8x8_t b) {
  return (a & b) != 0;
}

input to ISEL:
  int8x8_t _1;
  vector(8) <signed-boolean:8> _2;
  int8x8_t _5;

  <bb 2> [local count: 1073741824]:
  _1 = a_3(D) & b_4(D);
  _2 = _1 != { 0, 0, 0, 0, 0, 0, 0, 0 };
  _5 = VEC_COND_EXPR <_2, { -1, -1, -1, -1, -1, -1, -1, -1 }, { 0, 0,
0, 0, 0, 0, 0, 0 }>;
  return _5;

and after the PR97872 fix, ISEL now generates:
  int8x8_t _1;
  vector(8) <signed-boolean:8> _2;
  int8x8_t _5;

  <bb 2> [local count: 1073741824]:
  _1 = a_3(D) & b_4(D);
  _2 = _1 != { 0, 0, 0, 0, 0, 0, 0, 0 };
  _5 = VIEW_CONVERT_EXPR<int8x8_t>(_2);
  return _5;

while before patch it lowered to:
  _1 = a_3(D) & b_4(D);
  _5 = .VCOND (_1, { 0, 0, 0, 0, 0, 0, 0, 0 }, { -1, -1, -1, -1, -1,
-1, -1, -1 }, { 0, 0, 0, 0, 0, 0, 0, 0 }, 114);

Should we fold
_1 = a & b
_2 = _1 != 0
to .VCOND (dest, -1, 0, _1, 0, AND_EXR) and as Hongtao suggested,
relax predicates for op1, op2 operands to match for -1, 0 respectively
in vcond and do specialized expansion there ?

In that case, I am not sure if folding x cmp y ? -1 : 0 to x cmp y
offers much benefit in the first place ?
On GIMPLE, the comparisons can be lowered to .VCOND, and the target
does specialized expansion depending on the operands.

Thanks,
Prathamesh
>
> Regards,
> Prathamesh
> >
> > > Alternatively, I could try fixing this in backend as you suggest above.
> > >
> > > Thanks,
> > > Prathamesh
> > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Alternatively, should we add a target hook that returns true if the
> > > > > > > > > > > > > > result of vector comparison is set to all-ones or all-zeros, and then
> > > > > > > > > > > > > > use this hook in gimple ISEL to effectively turn VEC_COND_EXPR into nop ?
> > > > > > > > > > > > >
> > > > > > > > > > > > > Would everything match-up for a .VEC_CMP IFN producing a non-mask
> > > > > > > > > > > > > vector type?  ISEL could special case the a ? -1 : 0 case this way.
> > > > > > > > > > > > I think the vec_cmp pattern matches but it produces a masked vector type.
> > > > > > > > > > > > In the attached patch, I simply replaced:
> > > > > > > > > > > > _1 = a < b
> > > > > > > > > > > > x = _1 ? -1 : 0
> > > > > > > > > > > > with
> > > > > > > > > > > > x = view_convert_expr<_1>
> > > > > > > > > > > >
> > > > > > > > > > > > For the test-case, isel generates:
> > > > > > > > > > > >   vector(8) <signed-boolean:8> _1;
> > > > > > > > > > > >   vector(8) signed char _2;
> > > > > > > > > > > >   uint8x8_t _5;
> > > > > > > > > > > >
> > > > > > > > > > > >   <bb 2> [local count: 1073741824]:
> > > > > > > > > > > >   _1 = a_3(D) < b_4(D);
> > > > > > > > > > > >   _2 = VIEW_CONVERT_EXPR<vector(8) signed char>(_1);
> > > > > > > > > > > >   _5 = VIEW_CONVERT_EXPR<uint8x8_t>(_2);
> > > > > > > > > > > >   return _5;
> > > > > > > > > > > >
> > > > > > > > > > > > and results in desired code-gen:
> > > > > > > > > > > > f1:
> > > > > > > > > > > >         vcgt.s8 d0, d1, d0
> > > > > > > > > > > >         bx      lr
> > > > > > > > > > > >
> > > > > > > > > > > > Altho I guess, we should remove the redundant conversions during isel itself ?
> > > > > > > > > > > > and result in:
> > > > > > > > > > > > _1 = a_3(D) < b_4(D)
> > > > > > > > > > > > _5 = VIEW_CONVERT_EXPR<uint8x8_t>(_1)
> > > > > > > > > > > >
> > > > > > > > > > > > (Patch is lightly tested with only vect.exp)
> > > > > > > > > > >
> > > > > > > > > > > +  /* For targets where result of comparison is all-ones or all-zeros,
> > > > > > > > > > > +     a < b ? -1 : 0 can be reduced to a < b.  */
> > > > > > > > > > > +
> > > > > > > > > > > +  if (integer_minus_onep (op1) && integer_zerop (op2))
> > > > > > > > > > > +    {
> > > > > > > > > > >
> > > > > > > > > > > So this really belongs here:
> > > > > > > > > > >
> > > > > > > > > > >           tree op0_type = TREE_TYPE (op0);
> > > > > > > > > > >           tree op0a_type = TREE_TYPE (op0a);
> > > > > > > > > > >
> > > > > > > > > > > <---
> > > > > > > > > > >
> > > > > > > > > > >           if (used_vec_cond_exprs >= 2
> > > > > > > > > > >               && (get_vcond_mask_icode (mode, TYPE_MODE (op0_type))
> > > > > > > > > > >                   != CODE_FOR_nothing)
> > > > > > > > > > >               && expand_vec_cmp_expr_p (op0a_type, op0_type, tcode))
> > > > > > > > > > >             {
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > +      gassign *def_stmt = dyn_cast<gassign *> (SSA_NAME_DEF_STMT (op0));
> > > > > > > > > > > +      tree op0a = gimple_assign_rhs1 (def_stmt);
> > > > > > > > > > > +      tree op0_type = TREE_TYPE (op0);
> > > > > > > > > > > +      tree op0a_type = TREE_TYPE (op0a);
> > > > > > > > > > > +      enum tree_code tcode = gimple_assign_rhs_code (def_stmt);
> > > > > > > > > > > +
> > > > > > > > > > > +      if (expand_vec_cmp_expr_p (op0a_type, op0_type, tcode))
> > > > > > > > > > > +       {
> > > > > > > > > > > +         tree conv_op = build1 (VIEW_CONVERT_EXPR, TREE_TYPE (lhs), op0);
> > > > > > > > > > > +         gassign *new_stmt = gimple_build_assign (lhs, conv_op);
> > > > > > > > > > > +         gsi_replace (gsi, new_stmt, true);
> > > > > > > > > > >
> > > > > > > > > > > and you need to verify that the mode of the lhs and the mode of op0
> > > > > > > > > > > agree and that the target can actually expand_vec_cmp_expr_p
> > > > > > > > > > Thanks for the suggestions, does the attached patch look OK ?
> > > > > > > > > > Sorry, I am not sure how to check if target can actually expand vec_cmp ?
> > > > > > > > > > I assume that since expand_vec_cmp_expr_p queries optab and if it gets
> > > > > > > > > > a valid cmp icode, that
> > > > > > > > > > should be sufficient ?
> > > > > > > > >
> > > > > > > > > Yes
> > > > > > > > Hi Richard,
> > > > > > > > I tested the patch, and it shows one regression for pr78102.c, because
> > > > > > > > of extra pcmpeqq in code-gen for x != y on x86.
> > > > > > > > For the test-case:
> > > > > > > > __v2di
> > > > > > > > baz (const __v2di x, const __v2di y)
> > > > > > > > {
> > > > > > > >   return x != y;
> > > > > > > > }
> > > > > > > >
> > > > > > > > Before patch:
> > > > > > > > baz:
> > > > > > > >         pcmpeqq %xmm1, %xmm0
> > > > > > > >         pcmpeqd %xmm1, %xmm1
> > > > > > > >         pandn   %xmm1, %xmm0
> > > > > > > >         ret
> > > > > > > >
> > > > > > > > After patch,
> > > > > > > > Before ISEL:
> > > > > > > >   vector(2) <signed-boolean:64> _1;
> > > > > > > >   __v2di _4;
> > > > > > > >
> > > > > > > >   <bb 2> [local count: 1073741824]:
> > > > > > > >   _1 = x_2(D) != y_3(D);
> > > > > > > >   _4 = VEC_COND_EXPR <_1, { -1, -1 }, { 0, 0 }>;
> > > > > > > >   return _4;
> > > > > > > >
> > > > > > > > After ISEL:
> > > > > > > >   vector(2) <signed-boolean:64> _1;
> > > > > > > >   __v2di _4;
> > > > > > > >
> > > > > > > >   <bb 2> [local count: 1073741824]:
> > > > > > > >   _1 = x_2(D) != y_3(D);
> > > > > > > >   _4 = VIEW_CONVERT_EXPR<__v2di>(_1);
> > > > > > > >   return _4;
> > > > > > > >
> > > > > > > > which results in:
> > > > > > > >         pcmpeqq %xmm1, %xmm0
> > > > > > > >         pxor    %xmm1, %xmm1
> > > > > > > >         pcmpeqq %xmm1, %xmm0
> > > > > > > >         ret
> > seems better to be
> >
> >          pcmpeqq %xmm1, %xmm0
> >          pxor    %xmm1, %xmm1
> >          pxor %xmm1, %xmm0
> >          ret
> >
> > Anyway, it needs backend adjustment.
> >
> > > > > > > > IIUC, the new code-gen is essentially comparing two args for equality, and then
> > > > > > > > comparing the result against zero to invert it, so it looks correct ?
> > > > > > > > I am not sure which of the above two sequences is better tho ?
> > > > > > > > If the new code-gen is OK, would it be OK to adjust the test-case ?
> > > > > > >
> > > > > > > In case pcmpeqq is double-issue the first variant might be faster while
> > > > > > > the second variant has the advantage of the "free" pxor, but back-to-back
> > > > > > > pcmpeqq might have an issue.
> > > > > > >
> > > > > > > I think on GIMPLE the new code is preferable and adjustments are
> > > > > > > target business.  I wouldn't be surprised if the x86 backend
> > > > > > > special-cases vcond to {-1,-1}, {0,0} already to arrive at the first
> > > > > > > variant.
> > > > > > >
> > > > > > > Did you check how
> > > > > > >
> > > > > > > a = x != y ? { -1, -1 } : {0, 0 };
> > > > > > > b = x != y ? { 1, 2 } : { 3, 4 };
> > > > > > >
> > > > > > > is handled before/after your patch?  That is, make the comparison
> > > > > > > CSEd between two VEC_COND_EXPRs?
> > > > > > For the test-case:
> > > > > > __v2di f(__v2di, __v2di);
> > > > > >
> > > > > > __v2di
> > > > > > baz (const __v2di x, const __v2di y)
> > > > > > {
> > > > > >   __v2di a = (x != y);
> > > > > >   __v2di b = (x != y) ? (__v2di) {1, 2} : (__v2di) {3, 4};
> > > > > >   return f (a, b);
> > > > > > }
> > > > > >
> > > > > > Before patch, isel converts both to .vcondeq:
> > > > > >   __v2di b;
> > > > > >   __v2di a;
> > > > > >   __v2di _8;
> > > > > >
> > > > > >   <bb 2> [local count: 1073741824]:
> > > > > >   a_4 = .VCONDEQ (x_2(D), y_3(D), { -1, -1 }, { 0, 0 }, 114);
> > > > > >   b_5 = .VCONDEQ (x_2(D), y_3(D), { 1, 2 }, { 3, 4 }, 114);
> > > > > >   _8 = f (a_4, b_5); [tail call]
> > > > > >   return _8;
> > > > > >
> > > > > > and results in following code-gen:
> > > > > > _Z3bazDv2_xS_:
> > > > > > .LFB5666:
> > > > > >         pcmpeqq %xmm1, %xmm0
> > > > > >         pcmpeqd %xmm1, %xmm1
> > > > > >         movdqa  %xmm0, %xmm2
> > > > > >         pandn   %xmm1, %xmm2
> > > > > >         movdqa  .LC0(%rip), %xmm1
> > > > > >         pblendvb        %xmm0, .LC1(%rip), %xmm1
> > > > > >         movdqa  %xmm2, %xmm0
> > > > > >         jmp     _Z1fDv2_xS_
> > > > > >
> > > > > > With patch, isel converts a = (x != y) ? {-1, -1} : {0, 0} to
> > > > > > view_convert_expr and the other
> > > > > > to vcondeq:
> > > > > >   __v2di b;
> > > > > >   __v2di a;
> > > > > >   vector(2) <signed-boolean:64> _1;
> > > > > >   __v2di _8;
> > > > > >
> > > > > >   <bb 2> [local count: 1073741824]:
> > > > > >   _1 = x_2(D) != y_3(D);
> > > > > >   a_4 = VIEW_CONVERT_EXPR<__v2di>(_1);
> > > > > >   b_5 = .VCONDEQ (x_2(D), y_3(D), { 1, 2 }, { 3, 4 }, 114);
> > > > > >   _8 = f (a_4, b_5); [tail call]
> > > > > >   return _8;
> > > > > >
> > > > > > which results in following code-gen:
> > > > > > _Z3bazDv2_xS_:
> > > > > > .LFB5666:
> > > > > >         pcmpeqq %xmm1, %xmm0
> > > > > >         pxor    %xmm2, %xmm2
> > > > > >         movdqa  .LC0(%rip), %xmm1
> > > > > >         pblendvb        %xmm0, .LC1(%rip), %xmm1
> > > > > >         pcmpeqq %xmm0, %xmm2
> > > > > >         movdqa  %xmm2, %xmm0
> > > > > >         jmp     _Z1fDv2_xS_
> > > > >
> > > > > Ok, thanks for checking.  I think the patch is OK but please let
> > > > > Hongtao the chance to comment.
> > > > >
> > > > > Richard.
> > > > >
> > > > > > Thanks,
> > > > > > Prathamesh
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Richard.
> > > > > > >
> > > > > > >
> > > > > > > > Thanks,
> > > > > > > > Prathamesh
> > > > > > > > >
> > > > > > > > > > Thanks,
> > > > > > > > > > Prathamesh
> > > > > > > > > > >
> > > > > > > > > > > Richard.
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > > Thanks,
> > > > > > > > > > > > Prathamesh
> > > > > > > > > > > > >
> > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > Prathamesh
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > --
> > > > > > > > > > > > > Richard Biener <rguenther@suse.de>
> > > > > > > > > > > > > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> > > > > > > > > > > > > Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > --
> > > > > > > > > > > Richard Biener <rguenther@suse.de>
> > > > > > > > > > > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> > > > > > > > > > > Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > > Richard Biener <rguenther@suse.de>
> > > > > > > > > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> > > > > > > > > Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
> > > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > > Richard Biener <rguenther@suse.de>
> > > > > > > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> > > > > > > Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
> > > > > >
> > > > >
> > > > > --
> > > > > Richard Biener <rguenther@suse.de>
> > > > > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> > > > > Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
> > > >
> > > >
> > > >
> > > > --
> > > > BR,
> > > > Hongtao
> >
> >
> >
> > --
> > BR,
> > Hongtao

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Help with PR97872
  2020-12-09 11:47                           ` Prathamesh Kulkarni
@ 2020-12-10  1:30                             ` Hongtao Liu
  2020-12-10  7:15                             ` Richard Biener
  1 sibling, 0 replies; 18+ messages in thread
From: Hongtao Liu @ 2020-12-10  1:30 UTC (permalink / raw)
  To: Prathamesh Kulkarni
  Cc: Richard Biener, GCC Development, Liu, Hongtao, gcc Patches

It seems better with your PR97872 fix on i386.

Cat test.c

typedef char v16qi __attribute__ ((vector_size(16)));
v16qi f1(v16qi a, v16qi b) {
return (a & b) != 0;
}

before

f1(char __vector(16), char __vector(16)):
pand %xmm1, %xmm0
pxor %xmm1, %xmm1
pcmpeqb %xmm1, %xmm0
pcmpeqd %xmm1, %xmm1
pandn %xmm1, %xmm0
ret

After the pr97872 fix

f1(char __vector(16), char __vector(16)):
pand xmm0, xmm1
pxor xmm1, xmm1
pcmpeqb xmm0, xmm1
pcmpeqb xmm0, xmm1
ret

On Wed, Dec 9, 2020 at 7:47 PM Prathamesh Kulkarni
<prathamesh.kulkarni@linaro.org> wrote:
>
> On Tue, 8 Dec 2020 at 14:36, Prathamesh Kulkarni
> <prathamesh.kulkarni@linaro.org> wrote:
> >
> > On Mon, 7 Dec 2020 at 17:37, Hongtao Liu <crazylht@gmail.com> wrote:
> > >
> > > On Mon, Dec 7, 2020 at 7:11 PM Prathamesh Kulkarni
> > > <prathamesh.kulkarni@linaro.org> wrote:
> > > >
> > > > On Mon, 7 Dec 2020 at 16:15, Hongtao Liu <crazylht@gmail.com> wrote:
> > > > >
> > > > > On Mon, Dec 7, 2020 at 5:47 PM Richard Biener <rguenther@suse.de> wrote:
> > > > > >
> > > > > > On Mon, 7 Dec 2020, Prathamesh Kulkarni wrote:
> > > > > >
> > > > > > > On Mon, 7 Dec 2020 at 13:01, Richard Biener <rguenther@suse.de> wrote:
> > > > > > > >
> > > > > > > > On Mon, 7 Dec 2020, Prathamesh Kulkarni wrote:
> > > > > > > >
> > > > > > > > > On Fri, 4 Dec 2020 at 17:18, Richard Biener <rguenther@suse.de> wrote:
> > > > > > > > > >
> > > > > > > > > > On Fri, 4 Dec 2020, Prathamesh Kulkarni wrote:
> > > > > > > > > >
> > > > > > > > > > > On Thu, 3 Dec 2020 at 16:35, Richard Biener <rguenther@suse.de> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > On Thu, 3 Dec 2020, Prathamesh Kulkarni wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > On Tue, 1 Dec 2020 at 16:39, Richard Biener <rguenther@suse.de> wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Tue, 1 Dec 2020, Prathamesh Kulkarni wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Hi,
> > > > > > > > > > > > > > > For the test mentioned in PR, I was trying to see if we could do
> > > > > > > > > > > > > > > specialized expansion for vcond in target when operands are -1 and 0.
> > > > > > > > > > > > > > > arm_expand_vcond gets the following operands:
> > > > > > > > > > > > > > > (reg:V8QI 113 [ _2 ])
> > > > > > > > > > > > > > > (reg:V8QI 117)
> > > > > > > > > > > > > > > (reg:V8QI 118)
> > > > > > > > > > > > > > > (lt (reg/v:V8QI 115 [ a ])
> > > > > > > > > > > > > > >     (reg/v:V8QI 116 [ b ]))
> > > > > > > > > > > > > > > (reg/v:V8QI 115 [ a ])
> > > > > > > > > > > > > > > (reg/v:V8QI 116 [ b ])
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > where r117 and r118 are set to vector constants -1 and 0 respectively.
> > > > > > > > > > > > > > > However, I am not sure if there's a way to check if the register is
> > > > > > > > > > > > > > > constant during expansion time (since we don't have df analysis yet) ?
> > > > >
> > > > > It seems to me that all you need to do is relax the predicates of op1
> > > > > and op2 in vcondmn to accept const0_rtx and constm1_rtx. I haven't
> > > > > debugged it, but I see that vcondmn in neon.md only accepts
> > > > > s_register_operand.
> > > > >
> > > > > (define_expand "vcond<mode><mode>"
> > > > >   [(set (match_operand:VDQW 0 "s_register_operand")
> > > > >         (if_then_else:VDQW
> > > > >           (match_operator 3 "comparison_operator"
> > > > >             [(match_operand:VDQW 4 "s_register_operand")
> > > > >              (match_operand:VDQW 5 "reg_or_zero_operand")])
> > > > >           (match_operand:VDQW 1 "s_register_operand")
> > > > >           (match_operand:VDQW 2 "s_register_operand")))]
> > > > >   "TARGET_NEON && (!<Is_float_mode> || flag_unsafe_math_optimizations)"
> > > > > {
> > > > >   arm_expand_vcond (operands, <V_cmp_result>mode);
> > > > >   DONE;
> > > > > })
> > > > >
> > > > > in sse.md it's defined as
> > > > > (define_expand "vcondu<V_512:mode><VI_AVX512BW:mode>"
> > > > >   [(set (match_operand:V_512 0 "register_operand")
> > > > >         (if_then_else:V_512
> > > > >           (match_operator 3 ""
> > > > >             [(match_operand:VI_AVX512BW 4 "nonimmediate_operand")
> > > > >              (match_operand:VI_AVX512BW 5 "nonimmediate_operand")])
> > > > >           (match_operand:V_512 1 "general_operand")
> > > > >           (match_operand:V_512 2 "general_operand")))]
> > > > >   "TARGET_AVX512F
> > > > >    && (GET_MODE_NUNITS (<V_512:MODE>mode)
> > > > >        == GET_MODE_NUNITS (<VI_AVX512BW:MODE>mode))"
> > > > > {
> > > > >   bool ok = ix86_expand_int_vcond (operands);
> > > > >   gcc_assert (ok);
> > > > >   DONE;
> > > > > })
> > > > >
> > > > > then we can get operands[1] and operands[2] as
> > > > >
> > > > > (gdb) p debug_rtx (operands[1])
> > > > >  (const_vector:V16QI [
> > > > >         (const_int -1 [0xffffffffffffffff]) repeated x16
> > > > >     ])
> > > > > (gdb) p debug_rtx (operands[2])
> > > > > (reg:V16QI 82 [ _2 ])
> > > > > (const_vector:V16QI [
> > > > >         (const_int 0 [0]) repeated x16
> > > > >     ])
> > > > Hi Hongtao,
> > > > Thanks for the suggestions!
> > > > However IIUC from vector extensions doc page, the result of vector
> > > > comparison is defined to be 0
> > > > or -1, so would it be better to canonicalize
> > > > x cmp y ? -1 : 0 to x cmp y, on GIMPLE itself during gimple-isel and
> > > > adjust targets if required ?
> > >
> > > Yes, it would be more straightforward to handle it in gimple isel, I
> > > would adjust the backend and testcase after you check in the patch.
> > Thanks! I have committed the attached patch in
> > 3a6e3ad38a17a03ee0139b49a0946e7b9ded1eb1.
> Hi,
> I was looking at similar issue in PR97903 and wondering what will be the
> right approach to lower (a & b) != 0 to vtst ?
> For test-case:
> #include <arm_neon.h>
>
> int8x8_t f1(int8x8_t a, int8x8_t b) {
>   return (a & b) != 0;
> }
>
> input to ISEL:
>   int8x8_t _1;
>   vector(8) <signed-boolean:8> _2;
>   int8x8_t _5;
>
>   <bb 2> [local count: 1073741824]:
>   _1 = a_3(D) & b_4(D);
>   _2 = _1 != { 0, 0, 0, 0, 0, 0, 0, 0 };
>   _5 = VEC_COND_EXPR <_2, { -1, -1, -1, -1, -1, -1, -1, -1 }, { 0, 0,
> 0, 0, 0, 0, 0, 0 }>;
>   return _5;
>
> and after the PR97872 fix, ISEL now generates:
>   int8x8_t _1;
>   vector(8) <signed-boolean:8> _2;
>   int8x8_t _5;
>
>   <bb 2> [local count: 1073741824]:
>   _1 = a_3(D) & b_4(D);
>   _2 = _1 != { 0, 0, 0, 0, 0, 0, 0, 0 };
>   _5 = VIEW_CONVERT_EXPR<int8x8_t>(_2);
>   return _5;
>
> while before patch it lowered to:
>   _1 = a_3(D) & b_4(D);
>   _5 = .VCOND (_1, { 0, 0, 0, 0, 0, 0, 0, 0 }, { -1, -1, -1, -1, -1,
> -1, -1, -1 }, { 0, 0, 0, 0, 0, 0, 0, 0 }, 114);
>
> Should we fold
> _1 = a & b
> _2 = _1 != 0
> to .VCOND (dest, -1, 0, _1, 0, AND_EXR) and as Hongtao suggested,
> relax predicates for op1, op2 operands to match for -1, 0 respectively
> in vcond and do specialized expansion there ?
>
> In that case, I am not sure if folding x cmp y ? -1 : 0 to x cmp y
> offers much benefit in the first place ?
> On GIMPLE, the comparisons can be lowered to .VCOND, and the target
> does specialized expansion depending on the operands.
>
> Thanks,
> Prathamesh
> >
> > Regards,
> > Prathamesh
> > >
> > > > Alternatively, I could try fixing this in backend as you suggest above.
> > > >
> > > > Thanks,
> > > > Prathamesh
> > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Alternatively, should we add a target hook that returns true if the
> > > > > > > > > > > > > > > result of vector comparison is set to all-ones or all-zeros, and then
> > > > > > > > > > > > > > > use this hook in gimple ISEL to effectively turn VEC_COND_EXPR into nop ?
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Would everything match-up for a .VEC_CMP IFN producing a non-mask
> > > > > > > > > > > > > > vector type?  ISEL could special case the a ? -1 : 0 case this way.
> > > > > > > > > > > > > I think the vec_cmp pattern matches but it produces a masked vector type.
> > > > > > > > > > > > > In the attached patch, I simply replaced:
> > > > > > > > > > > > > _1 = a < b
> > > > > > > > > > > > > x = _1 ? -1 : 0
> > > > > > > > > > > > > with
> > > > > > > > > > > > > x = view_convert_expr<_1>
> > > > > > > > > > > > >
> > > > > > > > > > > > > For the test-case, isel generates:
> > > > > > > > > > > > >   vector(8) <signed-boolean:8> _1;
> > > > > > > > > > > > >   vector(8) signed char _2;
> > > > > > > > > > > > >   uint8x8_t _5;
> > > > > > > > > > > > >
> > > > > > > > > > > > >   <bb 2> [local count: 1073741824]:
> > > > > > > > > > > > >   _1 = a_3(D) < b_4(D);
> > > > > > > > > > > > >   _2 = VIEW_CONVERT_EXPR<vector(8) signed char>(_1);
> > > > > > > > > > > > >   _5 = VIEW_CONVERT_EXPR<uint8x8_t>(_2);
> > > > > > > > > > > > >   return _5;
> > > > > > > > > > > > >
> > > > > > > > > > > > > and results in desired code-gen:
> > > > > > > > > > > > > f1:
> > > > > > > > > > > > >         vcgt.s8 d0, d1, d0
> > > > > > > > > > > > >         bx      lr
> > > > > > > > > > > > >
> > > > > > > > > > > > > Altho I guess, we should remove the redundant conversions during isel itself ?
> > > > > > > > > > > > > and result in:
> > > > > > > > > > > > > _1 = a_3(D) < b_4(D)
> > > > > > > > > > > > > _5 = VIEW_CONVERT_EXPR<uint8x8_t>(_1)
> > > > > > > > > > > > >
> > > > > > > > > > > > > (Patch is lightly tested with only vect.exp)
> > > > > > > > > > > >
> > > > > > > > > > > > +  /* For targets where result of comparison is all-ones or all-zeros,
> > > > > > > > > > > > +     a < b ? -1 : 0 can be reduced to a < b.  */
> > > > > > > > > > > > +
> > > > > > > > > > > > +  if (integer_minus_onep (op1) && integer_zerop (op2))
> > > > > > > > > > > > +    {
> > > > > > > > > > > >
> > > > > > > > > > > > So this really belongs here:
> > > > > > > > > > > >
> > > > > > > > > > > >           tree op0_type = TREE_TYPE (op0);
> > > > > > > > > > > >           tree op0a_type = TREE_TYPE (op0a);
> > > > > > > > > > > >
> > > > > > > > > > > > <---
> > > > > > > > > > > >
> > > > > > > > > > > >           if (used_vec_cond_exprs >= 2
> > > > > > > > > > > >               && (get_vcond_mask_icode (mode, TYPE_MODE (op0_type))
> > > > > > > > > > > >                   != CODE_FOR_nothing)
> > > > > > > > > > > >               && expand_vec_cmp_expr_p (op0a_type, op0_type, tcode))
> > > > > > > > > > > >             {
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > +      gassign *def_stmt = dyn_cast<gassign *> (SSA_NAME_DEF_STMT (op0));
> > > > > > > > > > > > +      tree op0a = gimple_assign_rhs1 (def_stmt);
> > > > > > > > > > > > +      tree op0_type = TREE_TYPE (op0);
> > > > > > > > > > > > +      tree op0a_type = TREE_TYPE (op0a);
> > > > > > > > > > > > +      enum tree_code tcode = gimple_assign_rhs_code (def_stmt);
> > > > > > > > > > > > +
> > > > > > > > > > > > +      if (expand_vec_cmp_expr_p (op0a_type, op0_type, tcode))
> > > > > > > > > > > > +       {
> > > > > > > > > > > > +         tree conv_op = build1 (VIEW_CONVERT_EXPR, TREE_TYPE (lhs), op0);
> > > > > > > > > > > > +         gassign *new_stmt = gimple_build_assign (lhs, conv_op);
> > > > > > > > > > > > +         gsi_replace (gsi, new_stmt, true);
> > > > > > > > > > > >
> > > > > > > > > > > > and you need to verify that the mode of the lhs and the mode of op0
> > > > > > > > > > > > agree and that the target can actually expand_vec_cmp_expr_p
> > > > > > > > > > > Thanks for the suggestions, does the attached patch look OK ?
> > > > > > > > > > > Sorry, I am not sure how to check if target can actually expand vec_cmp ?
> > > > > > > > > > > I assume that since expand_vec_cmp_expr_p queries optab and if it gets
> > > > > > > > > > > a valid cmp icode, that
> > > > > > > > > > > should be sufficient ?
> > > > > > > > > >
> > > > > > > > > > Yes
> > > > > > > > > Hi Richard,
> > > > > > > > > I tested the patch, and it shows one regression for pr78102.c, because
> > > > > > > > > of extra pcmpeqq in code-gen for x != y on x86.
> > > > > > > > > For the test-case:
> > > > > > > > > __v2di
> > > > > > > > > baz (const __v2di x, const __v2di y)
> > > > > > > > > {
> > > > > > > > >   return x != y;
> > > > > > > > > }
> > > > > > > > >
> > > > > > > > > Before patch:
> > > > > > > > > baz:
> > > > > > > > >         pcmpeqq %xmm1, %xmm0
> > > > > > > > >         pcmpeqd %xmm1, %xmm1
> > > > > > > > >         pandn   %xmm1, %xmm0
> > > > > > > > >         ret
> > > > > > > > >
> > > > > > > > > After patch,
> > > > > > > > > Before ISEL:
> > > > > > > > >   vector(2) <signed-boolean:64> _1;
> > > > > > > > >   __v2di _4;
> > > > > > > > >
> > > > > > > > >   <bb 2> [local count: 1073741824]:
> > > > > > > > >   _1 = x_2(D) != y_3(D);
> > > > > > > > >   _4 = VEC_COND_EXPR <_1, { -1, -1 }, { 0, 0 }>;
> > > > > > > > >   return _4;
> > > > > > > > >
> > > > > > > > > After ISEL:
> > > > > > > > >   vector(2) <signed-boolean:64> _1;
> > > > > > > > >   __v2di _4;
> > > > > > > > >
> > > > > > > > >   <bb 2> [local count: 1073741824]:
> > > > > > > > >   _1 = x_2(D) != y_3(D);
> > > > > > > > >   _4 = VIEW_CONVERT_EXPR<__v2di>(_1);
> > > > > > > > >   return _4;
> > > > > > > > >
> > > > > > > > > which results in:
> > > > > > > > >         pcmpeqq %xmm1, %xmm0
> > > > > > > > >         pxor    %xmm1, %xmm1
> > > > > > > > >         pcmpeqq %xmm1, %xmm0
> > > > > > > > >         ret
> > > seems better to be
> > >
> > >          pcmpeqq %xmm1, %xmm0
> > >          pxor    %xmm1, %xmm1
> > >          pxor %xmm1, %xmm0
> > >          ret
> > >
> > > Anyway, it needs backend adjustment.
> > >
> > > > > > > > > IIUC, the new code-gen is essentially comparing two args for equality, and then
> > > > > > > > > comparing the result against zero to invert it, so it looks correct ?
> > > > > > > > > I am not sure which of the above two sequences is better tho ?
> > > > > > > > > If the new code-gen is OK, would it be OK to adjust the test-case ?
> > > > > > > >
> > > > > > > > In case pcmpeqq is double-issue the first variant might be faster while
> > > > > > > > the second variant has the advantage of the "free" pxor, but back-to-back
> > > > > > > > pcmpeqq might have an issue.
> > > > > > > >
> > > > > > > > I think on GIMPLE the new code is preferable and adjustments are
> > > > > > > > target business.  I wouldn't be surprised if the x86 backend
> > > > > > > > special-cases vcond to {-1,-1}, {0,0} already to arrive at the first
> > > > > > > > variant.
> > > > > > > >
> > > > > > > > Did you check how
> > > > > > > >
> > > > > > > > a = x != y ? { -1, -1 } : {0, 0 };
> > > > > > > > b = x != y ? { 1, 2 } : { 3, 4 };
> > > > > > > >
> > > > > > > > is handled before/after your patch?  That is, make the comparison
> > > > > > > > CSEd between two VEC_COND_EXPRs?
> > > > > > > For the test-case:
> > > > > > > __v2di f(__v2di, __v2di);
> > > > > > >
> > > > > > > __v2di
> > > > > > > baz (const __v2di x, const __v2di y)
> > > > > > > {
> > > > > > >   __v2di a = (x != y);
> > > > > > >   __v2di b = (x != y) ? (__v2di) {1, 2} : (__v2di) {3, 4};
> > > > > > >   return f (a, b);
> > > > > > > }
> > > > > > >
> > > > > > > Before patch, isel converts both to .vcondeq:
> > > > > > >   __v2di b;
> > > > > > >   __v2di a;
> > > > > > >   __v2di _8;
> > > > > > >
> > > > > > >   <bb 2> [local count: 1073741824]:
> > > > > > >   a_4 = .VCONDEQ (x_2(D), y_3(D), { -1, -1 }, { 0, 0 }, 114);
> > > > > > >   b_5 = .VCONDEQ (x_2(D), y_3(D), { 1, 2 }, { 3, 4 }, 114);
> > > > > > >   _8 = f (a_4, b_5); [tail call]
> > > > > > >   return _8;
> > > > > > >
> > > > > > > and results in following code-gen:
> > > > > > > _Z3bazDv2_xS_:
> > > > > > > .LFB5666:
> > > > > > >         pcmpeqq %xmm1, %xmm0
> > > > > > >         pcmpeqd %xmm1, %xmm1
> > > > > > >         movdqa  %xmm0, %xmm2
> > > > > > >         pandn   %xmm1, %xmm2
> > > > > > >         movdqa  .LC0(%rip), %xmm1
> > > > > > >         pblendvb        %xmm0, .LC1(%rip), %xmm1
> > > > > > >         movdqa  %xmm2, %xmm0
> > > > > > >         jmp     _Z1fDv2_xS_
> > > > > > >
> > > > > > > With patch, isel converts a = (x != y) ? {-1, -1} : {0, 0} to
> > > > > > > view_convert_expr and the other
> > > > > > > to vcondeq:
> > > > > > >   __v2di b;
> > > > > > >   __v2di a;
> > > > > > >   vector(2) <signed-boolean:64> _1;
> > > > > > >   __v2di _8;
> > > > > > >
> > > > > > >   <bb 2> [local count: 1073741824]:
> > > > > > >   _1 = x_2(D) != y_3(D);
> > > > > > >   a_4 = VIEW_CONVERT_EXPR<__v2di>(_1);
> > > > > > >   b_5 = .VCONDEQ (x_2(D), y_3(D), { 1, 2 }, { 3, 4 }, 114);
> > > > > > >   _8 = f (a_4, b_5); [tail call]
> > > > > > >   return _8;
> > > > > > >
> > > > > > > which results in following code-gen:
> > > > > > > _Z3bazDv2_xS_:
> > > > > > > .LFB5666:
> > > > > > >         pcmpeqq %xmm1, %xmm0
> > > > > > >         pxor    %xmm2, %xmm2
> > > > > > >         movdqa  .LC0(%rip), %xmm1
> > > > > > >         pblendvb        %xmm0, .LC1(%rip), %xmm1
> > > > > > >         pcmpeqq %xmm0, %xmm2
> > > > > > >         movdqa  %xmm2, %xmm0
> > > > > > >         jmp     _Z1fDv2_xS_
> > > > > >
> > > > > > Ok, thanks for checking.  I think the patch is OK but please let
> > > > > > Hongtao the chance to comment.
> > > > > >
> > > > > > Richard.
> > > > > >
> > > > > > > Thanks,
> > > > > > > Prathamesh
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > > Richard.
> > > > > > > >
> > > > > > > >
> > > > > > > > > Thanks,
> > > > > > > > > Prathamesh
> > > > > > > > > >
> > > > > > > > > > > Thanks,
> > > > > > > > > > > Prathamesh
> > > > > > > > > > > >
> > > > > > > > > > > > Richard.
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > Prathamesh
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > > Prathamesh
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > --
> > > > > > > > > > > > > > Richard Biener <rguenther@suse.de>
> > > > > > > > > > > > > > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> > > > > > > > > > > > > > Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > --
> > > > > > > > > > > > Richard Biener <rguenther@suse.de>
> > > > > > > > > > > > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> > > > > > > > > > > > Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > --
> > > > > > > > > > Richard Biener <rguenther@suse.de>
> > > > > > > > > > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> > > > > > > > > > Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
> > > > > > > > >
> > > > > > > >
> > > > > > > > --
> > > > > > > > Richard Biener <rguenther@suse.de>
> > > > > > > > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> > > > > > > > Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
> > > > > > >
> > > > > >
> > > > > > --
> > > > > > Richard Biener <rguenther@suse.de>
> > > > > > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> > > > > > Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > BR,
> > > > > Hongtao
> > >
> > >
> > >
> > > --
> > > BR,
> > > Hongtao



-- 
BR,
Hongtao

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Help with PR97872
  2020-12-09 11:47                           ` Prathamesh Kulkarni
  2020-12-10  1:30                             ` Hongtao Liu
@ 2020-12-10  7:15                             ` Richard Biener
  2020-12-10 13:00                               ` Prathamesh Kulkarni
  1 sibling, 1 reply; 18+ messages in thread
From: Richard Biener @ 2020-12-10  7:15 UTC (permalink / raw)
  To: Prathamesh Kulkarni
  Cc: Hongtao Liu, GCC Development, Liu, Hongtao, gcc Patches

On Wed, 9 Dec 2020, Prathamesh Kulkarni wrote:

> On Tue, 8 Dec 2020 at 14:36, Prathamesh Kulkarni
> <prathamesh.kulkarni@linaro.org> wrote:
> >
> > On Mon, 7 Dec 2020 at 17:37, Hongtao Liu <crazylht@gmail.com> wrote:
> > >
> > > On Mon, Dec 7, 2020 at 7:11 PM Prathamesh Kulkarni
> > > <prathamesh.kulkarni@linaro.org> wrote:
> > > >
> > > > On Mon, 7 Dec 2020 at 16:15, Hongtao Liu <crazylht@gmail.com> wrote:
> > > > >
> > > > > On Mon, Dec 7, 2020 at 5:47 PM Richard Biener <rguenther@suse.de> wrote:
> > > > > >
> > > > > > On Mon, 7 Dec 2020, Prathamesh Kulkarni wrote:
> > > > > >
> > > > > > > On Mon, 7 Dec 2020 at 13:01, Richard Biener <rguenther@suse.de> wrote:
> > > > > > > >
> > > > > > > > On Mon, 7 Dec 2020, Prathamesh Kulkarni wrote:
> > > > > > > >
> > > > > > > > > On Fri, 4 Dec 2020 at 17:18, Richard Biener <rguenther@suse.de> wrote:
> > > > > > > > > >
> > > > > > > > > > On Fri, 4 Dec 2020, Prathamesh Kulkarni wrote:
> > > > > > > > > >
> > > > > > > > > > > On Thu, 3 Dec 2020 at 16:35, Richard Biener <rguenther@suse.de> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > On Thu, 3 Dec 2020, Prathamesh Kulkarni wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > On Tue, 1 Dec 2020 at 16:39, Richard Biener <rguenther@suse.de> wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Tue, 1 Dec 2020, Prathamesh Kulkarni wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Hi,
> > > > > > > > > > > > > > > For the test mentioned in PR, I was trying to see if we could do
> > > > > > > > > > > > > > > specialized expansion for vcond in target when operands are -1 and 0.
> > > > > > > > > > > > > > > arm_expand_vcond gets the following operands:
> > > > > > > > > > > > > > > (reg:V8QI 113 [ _2 ])
> > > > > > > > > > > > > > > (reg:V8QI 117)
> > > > > > > > > > > > > > > (reg:V8QI 118)
> > > > > > > > > > > > > > > (lt (reg/v:V8QI 115 [ a ])
> > > > > > > > > > > > > > >     (reg/v:V8QI 116 [ b ]))
> > > > > > > > > > > > > > > (reg/v:V8QI 115 [ a ])
> > > > > > > > > > > > > > > (reg/v:V8QI 116 [ b ])
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > where r117 and r118 are set to vector constants -1 and 0 respectively.
> > > > > > > > > > > > > > > However, I am not sure if there's a way to check if the register is
> > > > > > > > > > > > > > > constant during expansion time (since we don't have df analysis yet) ?
> > > > >
> > > > > It seems to me that all you need to do is relax the predicates of op1
> > > > > and op2 in vcondmn to accept const0_rtx and constm1_rtx. I haven't
> > > > > debugged it, but I see that vcondmn in neon.md only accepts
> > > > > s_register_operand.
> > > > >
> > > > > (define_expand "vcond<mode><mode>"
> > > > >   [(set (match_operand:VDQW 0 "s_register_operand")
> > > > >         (if_then_else:VDQW
> > > > >           (match_operator 3 "comparison_operator"
> > > > >             [(match_operand:VDQW 4 "s_register_operand")
> > > > >              (match_operand:VDQW 5 "reg_or_zero_operand")])
> > > > >           (match_operand:VDQW 1 "s_register_operand")
> > > > >           (match_operand:VDQW 2 "s_register_operand")))]
> > > > >   "TARGET_NEON && (!<Is_float_mode> || flag_unsafe_math_optimizations)"
> > > > > {
> > > > >   arm_expand_vcond (operands, <V_cmp_result>mode);
> > > > >   DONE;
> > > > > })
> > > > >
> > > > > in sse.md it's defined as
> > > > > (define_expand "vcondu<V_512:mode><VI_AVX512BW:mode>"
> > > > >   [(set (match_operand:V_512 0 "register_operand")
> > > > >         (if_then_else:V_512
> > > > >           (match_operator 3 ""
> > > > >             [(match_operand:VI_AVX512BW 4 "nonimmediate_operand")
> > > > >              (match_operand:VI_AVX512BW 5 "nonimmediate_operand")])
> > > > >           (match_operand:V_512 1 "general_operand")
> > > > >           (match_operand:V_512 2 "general_operand")))]
> > > > >   "TARGET_AVX512F
> > > > >    && (GET_MODE_NUNITS (<V_512:MODE>mode)
> > > > >        == GET_MODE_NUNITS (<VI_AVX512BW:MODE>mode))"
> > > > > {
> > > > >   bool ok = ix86_expand_int_vcond (operands);
> > > > >   gcc_assert (ok);
> > > > >   DONE;
> > > > > })
> > > > >
> > > > > then we can get operands[1] and operands[2] as
> > > > >
> > > > > (gdb) p debug_rtx (operands[1])
> > > > >  (const_vector:V16QI [
> > > > >         (const_int -1 [0xffffffffffffffff]) repeated x16
> > > > >     ])
> > > > > (gdb) p debug_rtx (operands[2])
> > > > > (reg:V16QI 82 [ _2 ])
> > > > > (const_vector:V16QI [
> > > > >         (const_int 0 [0]) repeated x16
> > > > >     ])
> > > > Hi Hongtao,
> > > > Thanks for the suggestions!
> > > > However IIUC from vector extensions doc page, the result of vector
> > > > comparison is defined to be 0
> > > > or -1, so would it be better to canonicalize
> > > > x cmp y ? -1 : 0 to x cmp y, on GIMPLE itself during gimple-isel and
> > > > adjust targets if required ?
> > >
> > > Yes, it would be more straightforward to handle it in gimple isel, I
> > > would adjust the backend and testcase after you check in the patch.
> > Thanks! I have committed the attached patch in
> > 3a6e3ad38a17a03ee0139b49a0946e7b9ded1eb1.
> Hi,
> I was looking at similar issue in PR97903 and wondering what will be the
> right approach to lower (a & b) != 0 to vtst ?
> For test-case:
> #include <arm_neon.h>
> 
> int8x8_t f1(int8x8_t a, int8x8_t b) {
>   return (a & b) != 0;
> }
> 
> input to ISEL:
>   int8x8_t _1;
>   vector(8) <signed-boolean:8> _2;
>   int8x8_t _5;
> 
>   <bb 2> [local count: 1073741824]:
>   _1 = a_3(D) & b_4(D);
>   _2 = _1 != { 0, 0, 0, 0, 0, 0, 0, 0 };
>   _5 = VEC_COND_EXPR <_2, { -1, -1, -1, -1, -1, -1, -1, -1 }, { 0, 0,
> 0, 0, 0, 0, 0, 0 }>;
>   return _5;
> 
> and after the PR97872 fix, ISEL now generates:
>   int8x8_t _1;
>   vector(8) <signed-boolean:8> _2;
>   int8x8_t _5;
> 
>   <bb 2> [local count: 1073741824]:
>   _1 = a_3(D) & b_4(D);
>   _2 = _1 != { 0, 0, 0, 0, 0, 0, 0, 0 };
>   _5 = VIEW_CONVERT_EXPR<int8x8_t>(_2);
>   return _5;
> 
> while before patch it lowered to:
>   _1 = a_3(D) & b_4(D);
>   _5 = .VCOND (_1, { 0, 0, 0, 0, 0, 0, 0, 0 }, { -1, -1, -1, -1, -1,
> -1, -1, -1 }, { 0, 0, 0, 0, 0, 0, 0, 0 }, 114);
> 
> Should we fold
> _1 = a & b
> _2 = _1 != 0
> to .VCOND (dest, -1, 0, _1, 0, AND_EXR) and as Hongtao suggested,
> relax predicates for op1, op2 operands to match for -1, 0 respectively
> in vcond and do specialized expansion there ?
> 
> In that case, I am not sure if folding x cmp y ? -1 : 0 to x cmp y
> offers much benefit in the first place ?
> On GIMPLE, the comparisons can be lowered to .VCOND, and the target
> does specialized expansion depending on the operands.

So vtst can do (a & b) != 0 directly?  It looks like that's a job
for combine then and the feeding GIMPLE shouldn't matter too much
since independent of the choice above you'll get RTL to compute the
condition [mask] from the result of a vector AND.

Richard.

> Thanks,
> Prathamesh
> >
> > Regards,
> > Prathamesh
> > >
> > > > Alternatively, I could try fixing this in backend as you suggest above.
> > > >
> > > > Thanks,
> > > > Prathamesh
> > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Alternatively, should we add a target hook that returns true if the
> > > > > > > > > > > > > > > result of vector comparison is set to all-ones or all-zeros, and then
> > > > > > > > > > > > > > > use this hook in gimple ISEL to effectively turn VEC_COND_EXPR into nop ?
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Would everything match-up for a .VEC_CMP IFN producing a non-mask
> > > > > > > > > > > > > > vector type?  ISEL could special case the a ? -1 : 0 case this way.
> > > > > > > > > > > > > I think the vec_cmp pattern matches but it produces a masked vector type.
> > > > > > > > > > > > > In the attached patch, I simply replaced:
> > > > > > > > > > > > > _1 = a < b
> > > > > > > > > > > > > x = _1 ? -1 : 0
> > > > > > > > > > > > > with
> > > > > > > > > > > > > x = view_convert_expr<_1>
> > > > > > > > > > > > >
> > > > > > > > > > > > > For the test-case, isel generates:
> > > > > > > > > > > > >   vector(8) <signed-boolean:8> _1;
> > > > > > > > > > > > >   vector(8) signed char _2;
> > > > > > > > > > > > >   uint8x8_t _5;
> > > > > > > > > > > > >
> > > > > > > > > > > > >   <bb 2> [local count: 1073741824]:
> > > > > > > > > > > > >   _1 = a_3(D) < b_4(D);
> > > > > > > > > > > > >   _2 = VIEW_CONVERT_EXPR<vector(8) signed char>(_1);
> > > > > > > > > > > > >   _5 = VIEW_CONVERT_EXPR<uint8x8_t>(_2);
> > > > > > > > > > > > >   return _5;
> > > > > > > > > > > > >
> > > > > > > > > > > > > and results in desired code-gen:
> > > > > > > > > > > > > f1:
> > > > > > > > > > > > >         vcgt.s8 d0, d1, d0
> > > > > > > > > > > > >         bx      lr
> > > > > > > > > > > > >
> > > > > > > > > > > > > Altho I guess, we should remove the redundant conversions during isel itself ?
> > > > > > > > > > > > > and result in:
> > > > > > > > > > > > > _1 = a_3(D) < b_4(D)
> > > > > > > > > > > > > _5 = VIEW_CONVERT_EXPR<uint8x8_t>(_1)
> > > > > > > > > > > > >
> > > > > > > > > > > > > (Patch is lightly tested with only vect.exp)
> > > > > > > > > > > >
> > > > > > > > > > > > +  /* For targets where result of comparison is all-ones or all-zeros,
> > > > > > > > > > > > +     a < b ? -1 : 0 can be reduced to a < b.  */
> > > > > > > > > > > > +
> > > > > > > > > > > > +  if (integer_minus_onep (op1) && integer_zerop (op2))
> > > > > > > > > > > > +    {
> > > > > > > > > > > >
> > > > > > > > > > > > So this really belongs here:
> > > > > > > > > > > >
> > > > > > > > > > > >           tree op0_type = TREE_TYPE (op0);
> > > > > > > > > > > >           tree op0a_type = TREE_TYPE (op0a);
> > > > > > > > > > > >
> > > > > > > > > > > > <---
> > > > > > > > > > > >
> > > > > > > > > > > >           if (used_vec_cond_exprs >= 2
> > > > > > > > > > > >               && (get_vcond_mask_icode (mode, TYPE_MODE (op0_type))
> > > > > > > > > > > >                   != CODE_FOR_nothing)
> > > > > > > > > > > >               && expand_vec_cmp_expr_p (op0a_type, op0_type, tcode))
> > > > > > > > > > > >             {
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > +      gassign *def_stmt = dyn_cast<gassign *> (SSA_NAME_DEF_STMT (op0));
> > > > > > > > > > > > +      tree op0a = gimple_assign_rhs1 (def_stmt);
> > > > > > > > > > > > +      tree op0_type = TREE_TYPE (op0);
> > > > > > > > > > > > +      tree op0a_type = TREE_TYPE (op0a);
> > > > > > > > > > > > +      enum tree_code tcode = gimple_assign_rhs_code (def_stmt);
> > > > > > > > > > > > +
> > > > > > > > > > > > +      if (expand_vec_cmp_expr_p (op0a_type, op0_type, tcode))
> > > > > > > > > > > > +       {
> > > > > > > > > > > > +         tree conv_op = build1 (VIEW_CONVERT_EXPR, TREE_TYPE (lhs), op0);
> > > > > > > > > > > > +         gassign *new_stmt = gimple_build_assign (lhs, conv_op);
> > > > > > > > > > > > +         gsi_replace (gsi, new_stmt, true);
> > > > > > > > > > > >
> > > > > > > > > > > > and you need to verify that the mode of the lhs and the mode of op0
> > > > > > > > > > > > agree and that the target can actually expand_vec_cmp_expr_p
> > > > > > > > > > > Thanks for the suggestions, does the attached patch look OK ?
> > > > > > > > > > > Sorry, I am not sure how to check if target can actually expand vec_cmp ?
> > > > > > > > > > > I assume that since expand_vec_cmp_expr_p queries optab and if it gets
> > > > > > > > > > > a valid cmp icode, that
> > > > > > > > > > > should be sufficient ?
> > > > > > > > > >
> > > > > > > > > > Yes
> > > > > > > > > Hi Richard,
> > > > > > > > > I tested the patch, and it shows one regression for pr78102.c, because
> > > > > > > > > of extra pcmpeqq in code-gen for x != y on x86.
> > > > > > > > > For the test-case:
> > > > > > > > > __v2di
> > > > > > > > > baz (const __v2di x, const __v2di y)
> > > > > > > > > {
> > > > > > > > >   return x != y;
> > > > > > > > > }
> > > > > > > > >
> > > > > > > > > Before patch:
> > > > > > > > > baz:
> > > > > > > > >         pcmpeqq %xmm1, %xmm0
> > > > > > > > >         pcmpeqd %xmm1, %xmm1
> > > > > > > > >         pandn   %xmm1, %xmm0
> > > > > > > > >         ret
> > > > > > > > >
> > > > > > > > > After patch,
> > > > > > > > > Before ISEL:
> > > > > > > > >   vector(2) <signed-boolean:64> _1;
> > > > > > > > >   __v2di _4;
> > > > > > > > >
> > > > > > > > >   <bb 2> [local count: 1073741824]:
> > > > > > > > >   _1 = x_2(D) != y_3(D);
> > > > > > > > >   _4 = VEC_COND_EXPR <_1, { -1, -1 }, { 0, 0 }>;
> > > > > > > > >   return _4;
> > > > > > > > >
> > > > > > > > > After ISEL:
> > > > > > > > >   vector(2) <signed-boolean:64> _1;
> > > > > > > > >   __v2di _4;
> > > > > > > > >
> > > > > > > > >   <bb 2> [local count: 1073741824]:
> > > > > > > > >   _1 = x_2(D) != y_3(D);
> > > > > > > > >   _4 = VIEW_CONVERT_EXPR<__v2di>(_1);
> > > > > > > > >   return _4;
> > > > > > > > >
> > > > > > > > > which results in:
> > > > > > > > >         pcmpeqq %xmm1, %xmm0
> > > > > > > > >         pxor    %xmm1, %xmm1
> > > > > > > > >         pcmpeqq %xmm1, %xmm0
> > > > > > > > >         ret
> > > seems better to be
> > >
> > >          pcmpeqq %xmm1, %xmm0
> > >          pxor    %xmm1, %xmm1
> > >          pxor %xmm1, %xmm0
> > >          ret
> > >
> > > Anyway, it needs backend adjustment.
> > >
> > > > > > > > > IIUC, the new code-gen is essentially comparing two args for equality, and then
> > > > > > > > > comparing the result against zero to invert it, so it looks correct ?
> > > > > > > > > I am not sure which of the above two sequences is better tho ?
> > > > > > > > > If the new code-gen is OK, would it be OK to adjust the test-case ?
> > > > > > > >
> > > > > > > > In case pcmpeqq is double-issue the first variant might be faster while
> > > > > > > > the second variant has the advantage of the "free" pxor, but back-to-back
> > > > > > > > pcmpeqq might have an issue.
> > > > > > > >
> > > > > > > > I think on GIMPLE the new code is preferable and adjustments are
> > > > > > > > target business.  I wouldn't be surprised if the x86 backend
> > > > > > > > special-cases vcond to {-1,-1}, {0,0} already to arrive at the first
> > > > > > > > variant.
> > > > > > > >
> > > > > > > > Did you check how
> > > > > > > >
> > > > > > > > a = x != y ? { -1, -1 } : {0, 0 };
> > > > > > > > b = x != y ? { 1, 2 } : { 3, 4 };
> > > > > > > >
> > > > > > > > is handled before/after your patch?  That is, make the comparison
> > > > > > > > CSEd between two VEC_COND_EXPRs?
> > > > > > > For the test-case:
> > > > > > > __v2di f(__v2di, __v2di);
> > > > > > >
> > > > > > > __v2di
> > > > > > > baz (const __v2di x, const __v2di y)
> > > > > > > {
> > > > > > >   __v2di a = (x != y);
> > > > > > >   __v2di b = (x != y) ? (__v2di) {1, 2} : (__v2di) {3, 4};
> > > > > > >   return f (a, b);
> > > > > > > }
> > > > > > >
> > > > > > > Before patch, isel converts both to .vcondeq:
> > > > > > >   __v2di b;
> > > > > > >   __v2di a;
> > > > > > >   __v2di _8;
> > > > > > >
> > > > > > >   <bb 2> [local count: 1073741824]:
> > > > > > >   a_4 = .VCONDEQ (x_2(D), y_3(D), { -1, -1 }, { 0, 0 }, 114);
> > > > > > >   b_5 = .VCONDEQ (x_2(D), y_3(D), { 1, 2 }, { 3, 4 }, 114);
> > > > > > >   _8 = f (a_4, b_5); [tail call]
> > > > > > >   return _8;
> > > > > > >
> > > > > > > and results in following code-gen:
> > > > > > > _Z3bazDv2_xS_:
> > > > > > > .LFB5666:
> > > > > > >         pcmpeqq %xmm1, %xmm0
> > > > > > >         pcmpeqd %xmm1, %xmm1
> > > > > > >         movdqa  %xmm0, %xmm2
> > > > > > >         pandn   %xmm1, %xmm2
> > > > > > >         movdqa  .LC0(%rip), %xmm1
> > > > > > >         pblendvb        %xmm0, .LC1(%rip), %xmm1
> > > > > > >         movdqa  %xmm2, %xmm0
> > > > > > >         jmp     _Z1fDv2_xS_
> > > > > > >
> > > > > > > With patch, isel converts a = (x != y) ? {-1, -1} : {0, 0} to
> > > > > > > view_convert_expr and the other
> > > > > > > to vcondeq:
> > > > > > >   __v2di b;
> > > > > > >   __v2di a;
> > > > > > >   vector(2) <signed-boolean:64> _1;
> > > > > > >   __v2di _8;
> > > > > > >
> > > > > > >   <bb 2> [local count: 1073741824]:
> > > > > > >   _1 = x_2(D) != y_3(D);
> > > > > > >   a_4 = VIEW_CONVERT_EXPR<__v2di>(_1);
> > > > > > >   b_5 = .VCONDEQ (x_2(D), y_3(D), { 1, 2 }, { 3, 4 }, 114);
> > > > > > >   _8 = f (a_4, b_5); [tail call]
> > > > > > >   return _8;
> > > > > > >
> > > > > > > which results in following code-gen:
> > > > > > > _Z3bazDv2_xS_:
> > > > > > > .LFB5666:
> > > > > > >         pcmpeqq %xmm1, %xmm0
> > > > > > >         pxor    %xmm2, %xmm2
> > > > > > >         movdqa  .LC0(%rip), %xmm1
> > > > > > >         pblendvb        %xmm0, .LC1(%rip), %xmm1
> > > > > > >         pcmpeqq %xmm0, %xmm2
> > > > > > >         movdqa  %xmm2, %xmm0
> > > > > > >         jmp     _Z1fDv2_xS_
> > > > > >
> > > > > > Ok, thanks for checking.  I think the patch is OK but please let
> > > > > > Hongtao the chance to comment.
> > > > > >
> > > > > > Richard.
> > > > > >
> > > > > > > Thanks,
> > > > > > > Prathamesh
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > > Richard.
> > > > > > > >
> > > > > > > >
> > > > > > > > > Thanks,
> > > > > > > > > Prathamesh
> > > > > > > > > >
> > > > > > > > > > > Thanks,
> > > > > > > > > > > Prathamesh
> > > > > > > > > > > >
> > > > > > > > > > > > Richard.
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > Prathamesh
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > > Prathamesh
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > --
> > > > > > > > > > > > > > Richard Biener <rguenther@suse.de>
> > > > > > > > > > > > > > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> > > > > > > > > > > > > > Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > --
> > > > > > > > > > > > Richard Biener <rguenther@suse.de>
> > > > > > > > > > > > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> > > > > > > > > > > > Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > --
> > > > > > > > > > Richard Biener <rguenther@suse.de>
> > > > > > > > > > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> > > > > > > > > > Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
> > > > > > > > >
> > > > > > > >
> > > > > > > > --
> > > > > > > > Richard Biener <rguenther@suse.de>
> > > > > > > > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> > > > > > > > Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
> > > > > > >
> > > > > >
> > > > > > --
> > > > > > Richard Biener <rguenther@suse.de>
> > > > > > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> > > > > > Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > BR,
> > > > > Hongtao
> > >
> > >
> > >
> > > --
> > > BR,
> > > Hongtao
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Help with PR97872
  2020-12-10  7:15                             ` Richard Biener
@ 2020-12-10 13:00                               ` Prathamesh Kulkarni
  0 siblings, 0 replies; 18+ messages in thread
From: Prathamesh Kulkarni @ 2020-12-10 13:00 UTC (permalink / raw)
  To: Richard Biener; +Cc: Hongtao Liu, GCC Development, Liu, Hongtao, gcc Patches

On Thu, 10 Dec 2020 at 17:11, Richard Biener <rguenther@suse.de> wrote:
>
> On Wed, 9 Dec 2020, Prathamesh Kulkarni wrote:
>
> > On Tue, 8 Dec 2020 at 14:36, Prathamesh Kulkarni
> > <prathamesh.kulkarni@linaro.org> wrote:
> > >
> > > On Mon, 7 Dec 2020 at 17:37, Hongtao Liu <crazylht@gmail.com> wrote:
> > > >
> > > > On Mon, Dec 7, 2020 at 7:11 PM Prathamesh Kulkarni
> > > > <prathamesh.kulkarni@linaro.org> wrote:
> > > > >
> > > > > On Mon, 7 Dec 2020 at 16:15, Hongtao Liu <crazylht@gmail.com> wrote:
> > > > > >
> > > > > > On Mon, Dec 7, 2020 at 5:47 PM Richard Biener <rguenther@suse.de> wrote:
> > > > > > >
> > > > > > > On Mon, 7 Dec 2020, Prathamesh Kulkarni wrote:
> > > > > > >
> > > > > > > > On Mon, 7 Dec 2020 at 13:01, Richard Biener <rguenther@suse.de> wrote:
> > > > > > > > >
> > > > > > > > > On Mon, 7 Dec 2020, Prathamesh Kulkarni wrote:
> > > > > > > > >
> > > > > > > > > > On Fri, 4 Dec 2020 at 17:18, Richard Biener <rguenther@suse.de> wrote:
> > > > > > > > > > >
> > > > > > > > > > > On Fri, 4 Dec 2020, Prathamesh Kulkarni wrote:
> > > > > > > > > > >
> > > > > > > > > > > > On Thu, 3 Dec 2020 at 16:35, Richard Biener <rguenther@suse.de> wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Thu, 3 Dec 2020, Prathamesh Kulkarni wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > On Tue, 1 Dec 2020 at 16:39, Richard Biener <rguenther@suse.de> wrote:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > On Tue, 1 Dec 2020, Prathamesh Kulkarni wrote:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Hi,
> > > > > > > > > > > > > > > > For the test mentioned in PR, I was trying to see if we could do
> > > > > > > > > > > > > > > > specialized expansion for vcond in target when operands are -1 and 0.
> > > > > > > > > > > > > > > > arm_expand_vcond gets the following operands:
> > > > > > > > > > > > > > > > (reg:V8QI 113 [ _2 ])
> > > > > > > > > > > > > > > > (reg:V8QI 117)
> > > > > > > > > > > > > > > > (reg:V8QI 118)
> > > > > > > > > > > > > > > > (lt (reg/v:V8QI 115 [ a ])
> > > > > > > > > > > > > > > >     (reg/v:V8QI 116 [ b ]))
> > > > > > > > > > > > > > > > (reg/v:V8QI 115 [ a ])
> > > > > > > > > > > > > > > > (reg/v:V8QI 116 [ b ])
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > where r117 and r118 are set to vector constants -1 and 0 respectively.
> > > > > > > > > > > > > > > > However, I am not sure if there's a way to check if the register is
> > > > > > > > > > > > > > > > constant during expansion time (since we don't have df analysis yet) ?
> > > > > >
> > > > > > It seems to me that all you need to do is relax the predicates of op1
> > > > > > and op2 in vcondmn to accept const0_rtx and constm1_rtx. I haven't
> > > > > > debugged it, but I see that vcondmn in neon.md only accepts
> > > > > > s_register_operand.
> > > > > >
> > > > > > (define_expand "vcond<mode><mode>"
> > > > > >   [(set (match_operand:VDQW 0 "s_register_operand")
> > > > > >         (if_then_else:VDQW
> > > > > >           (match_operator 3 "comparison_operator"
> > > > > >             [(match_operand:VDQW 4 "s_register_operand")
> > > > > >              (match_operand:VDQW 5 "reg_or_zero_operand")])
> > > > > >           (match_operand:VDQW 1 "s_register_operand")
> > > > > >           (match_operand:VDQW 2 "s_register_operand")))]
> > > > > >   "TARGET_NEON && (!<Is_float_mode> || flag_unsafe_math_optimizations)"
> > > > > > {
> > > > > >   arm_expand_vcond (operands, <V_cmp_result>mode);
> > > > > >   DONE;
> > > > > > })
> > > > > >
> > > > > > in sse.md it's defined as
> > > > > > (define_expand "vcondu<V_512:mode><VI_AVX512BW:mode>"
> > > > > >   [(set (match_operand:V_512 0 "register_operand")
> > > > > >         (if_then_else:V_512
> > > > > >           (match_operator 3 ""
> > > > > >             [(match_operand:VI_AVX512BW 4 "nonimmediate_operand")
> > > > > >              (match_operand:VI_AVX512BW 5 "nonimmediate_operand")])
> > > > > >           (match_operand:V_512 1 "general_operand")
> > > > > >           (match_operand:V_512 2 "general_operand")))]
> > > > > >   "TARGET_AVX512F
> > > > > >    && (GET_MODE_NUNITS (<V_512:MODE>mode)
> > > > > >        == GET_MODE_NUNITS (<VI_AVX512BW:MODE>mode))"
> > > > > > {
> > > > > >   bool ok = ix86_expand_int_vcond (operands);
> > > > > >   gcc_assert (ok);
> > > > > >   DONE;
> > > > > > })
> > > > > >
> > > > > > then we can get operands[1] and operands[2] as
> > > > > >
> > > > > > (gdb) p debug_rtx (operands[1])
> > > > > >  (const_vector:V16QI [
> > > > > >         (const_int -1 [0xffffffffffffffff]) repeated x16
> > > > > >     ])
> > > > > > (gdb) p debug_rtx (operands[2])
> > > > > > (reg:V16QI 82 [ _2 ])
> > > > > > (const_vector:V16QI [
> > > > > >         (const_int 0 [0]) repeated x16
> > > > > >     ])
> > > > > Hi Hongtao,
> > > > > Thanks for the suggestions!
> > > > > However IIUC from vector extensions doc page, the result of vector
> > > > > comparison is defined to be 0
> > > > > or -1, so would it be better to canonicalize
> > > > > x cmp y ? -1 : 0 to x cmp y, on GIMPLE itself during gimple-isel and
> > > > > adjust targets if required ?
> > > >
> > > > Yes, it would be more straightforward to handle it in gimple isel, I
> > > > would adjust the backend and testcase after you check in the patch.
> > > Thanks! I have committed the attached patch in
> > > 3a6e3ad38a17a03ee0139b49a0946e7b9ded1eb1.
> > Hi,
> > I was looking at similar issue in PR97903 and wondering what will be the
> > right approach to lower (a & b) != 0 to vtst ?
> > For test-case:
> > #include <arm_neon.h>
> >
> > int8x8_t f1(int8x8_t a, int8x8_t b) {
> >   return (a & b) != 0;
> > }
> >
> > input to ISEL:
> >   int8x8_t _1;
> >   vector(8) <signed-boolean:8> _2;
> >   int8x8_t _5;
> >
> >   <bb 2> [local count: 1073741824]:
> >   _1 = a_3(D) & b_4(D);
> >   _2 = _1 != { 0, 0, 0, 0, 0, 0, 0, 0 };
> >   _5 = VEC_COND_EXPR <_2, { -1, -1, -1, -1, -1, -1, -1, -1 }, { 0, 0,
> > 0, 0, 0, 0, 0, 0 }>;
> >   return _5;
> >
> > and after the PR97872 fix, ISEL now generates:
> >   int8x8_t _1;
> >   vector(8) <signed-boolean:8> _2;
> >   int8x8_t _5;
> >
> >   <bb 2> [local count: 1073741824]:
> >   _1 = a_3(D) & b_4(D);
> >   _2 = _1 != { 0, 0, 0, 0, 0, 0, 0, 0 };
> >   _5 = VIEW_CONVERT_EXPR<int8x8_t>(_2);
> >   return _5;
> >
> > while before patch it lowered to:
> >   _1 = a_3(D) & b_4(D);
> >   _5 = .VCOND (_1, { 0, 0, 0, 0, 0, 0, 0, 0 }, { -1, -1, -1, -1, -1,
> > -1, -1, -1 }, { 0, 0, 0, 0, 0, 0, 0, 0 }, 114);
> >
> > Should we fold
> > _1 = a & b
> > _2 = _1 != 0
> > to .VCOND (dest, -1, 0, _1, 0, AND_EXR) and as Hongtao suggested,
> > relax predicates for op1, op2 operands to match for -1, 0 respectively
> > in vcond and do specialized expansion there ?
> >
> > In that case, I am not sure if folding x cmp y ? -1 : 0 to x cmp y
> > offers much benefit in the first place ?
> > On GIMPLE, the comparisons can be lowered to .VCOND, and the target
> > does specialized expansion depending on the operands.
>
> So vtst can do (a & b) != 0 directly?  It looks like that's a job
> for combine then and the feeding GIMPLE shouldn't matter too much
> since independent of the choice above you'll get RTL to compute the
> condition [mask] from the result of a vector AND.
OK, thanks for the suggestions. So let's keep the PR97872 fix, since
it also improves
code-gen for x86. I will try to fix this on RTL for arm to generate vtst.

Thanks,
Prathamesh
>
> Richard.
>
> > Thanks,
> > Prathamesh
> > >
> > > Regards,
> > > Prathamesh
> > > >
> > > > > Alternatively, I could try fixing this in backend as you suggest above.
> > > > >
> > > > > Thanks,
> > > > > Prathamesh
> > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Alternatively, should we add a target hook that returns true if the
> > > > > > > > > > > > > > > > result of vector comparison is set to all-ones or all-zeros, and then
> > > > > > > > > > > > > > > > use this hook in gimple ISEL to effectively turn VEC_COND_EXPR into nop ?
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Would everything match-up for a .VEC_CMP IFN producing a non-mask
> > > > > > > > > > > > > > > vector type?  ISEL could special case the a ? -1 : 0 case this way.
> > > > > > > > > > > > > > I think the vec_cmp pattern matches but it produces a masked vector type.
> > > > > > > > > > > > > > In the attached patch, I simply replaced:
> > > > > > > > > > > > > > _1 = a < b
> > > > > > > > > > > > > > x = _1 ? -1 : 0
> > > > > > > > > > > > > > with
> > > > > > > > > > > > > > x = view_convert_expr<_1>
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > For the test-case, isel generates:
> > > > > > > > > > > > > >   vector(8) <signed-boolean:8> _1;
> > > > > > > > > > > > > >   vector(8) signed char _2;
> > > > > > > > > > > > > >   uint8x8_t _5;
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >   <bb 2> [local count: 1073741824]:
> > > > > > > > > > > > > >   _1 = a_3(D) < b_4(D);
> > > > > > > > > > > > > >   _2 = VIEW_CONVERT_EXPR<vector(8) signed char>(_1);
> > > > > > > > > > > > > >   _5 = VIEW_CONVERT_EXPR<uint8x8_t>(_2);
> > > > > > > > > > > > > >   return _5;
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > and results in desired code-gen:
> > > > > > > > > > > > > > f1:
> > > > > > > > > > > > > >         vcgt.s8 d0, d1, d0
> > > > > > > > > > > > > >         bx      lr
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Altho I guess, we should remove the redundant conversions during isel itself ?
> > > > > > > > > > > > > > and result in:
> > > > > > > > > > > > > > _1 = a_3(D) < b_4(D)
> > > > > > > > > > > > > > _5 = VIEW_CONVERT_EXPR<uint8x8_t>(_1)
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > (Patch is lightly tested with only vect.exp)
> > > > > > > > > > > > >
> > > > > > > > > > > > > +  /* For targets where result of comparison is all-ones or all-zeros,
> > > > > > > > > > > > > +     a < b ? -1 : 0 can be reduced to a < b.  */
> > > > > > > > > > > > > +
> > > > > > > > > > > > > +  if (integer_minus_onep (op1) && integer_zerop (op2))
> > > > > > > > > > > > > +    {
> > > > > > > > > > > > >
> > > > > > > > > > > > > So this really belongs here:
> > > > > > > > > > > > >
> > > > > > > > > > > > >           tree op0_type = TREE_TYPE (op0);
> > > > > > > > > > > > >           tree op0a_type = TREE_TYPE (op0a);
> > > > > > > > > > > > >
> > > > > > > > > > > > > <---
> > > > > > > > > > > > >
> > > > > > > > > > > > >           if (used_vec_cond_exprs >= 2
> > > > > > > > > > > > >               && (get_vcond_mask_icode (mode, TYPE_MODE (op0_type))
> > > > > > > > > > > > >                   != CODE_FOR_nothing)
> > > > > > > > > > > > >               && expand_vec_cmp_expr_p (op0a_type, op0_type, tcode))
> > > > > > > > > > > > >             {
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > +      gassign *def_stmt = dyn_cast<gassign *> (SSA_NAME_DEF_STMT (op0));
> > > > > > > > > > > > > +      tree op0a = gimple_assign_rhs1 (def_stmt);
> > > > > > > > > > > > > +      tree op0_type = TREE_TYPE (op0);
> > > > > > > > > > > > > +      tree op0a_type = TREE_TYPE (op0a);
> > > > > > > > > > > > > +      enum tree_code tcode = gimple_assign_rhs_code (def_stmt);
> > > > > > > > > > > > > +
> > > > > > > > > > > > > +      if (expand_vec_cmp_expr_p (op0a_type, op0_type, tcode))
> > > > > > > > > > > > > +       {
> > > > > > > > > > > > > +         tree conv_op = build1 (VIEW_CONVERT_EXPR, TREE_TYPE (lhs), op0);
> > > > > > > > > > > > > +         gassign *new_stmt = gimple_build_assign (lhs, conv_op);
> > > > > > > > > > > > > +         gsi_replace (gsi, new_stmt, true);
> > > > > > > > > > > > >
> > > > > > > > > > > > > and you need to verify that the mode of the lhs and the mode of op0
> > > > > > > > > > > > > agree and that the target can actually expand_vec_cmp_expr_p
> > > > > > > > > > > > Thanks for the suggestions, does the attached patch look OK ?
> > > > > > > > > > > > Sorry, I am not sure how to check if target can actually expand vec_cmp ?
> > > > > > > > > > > > I assume that since expand_vec_cmp_expr_p queries optab and if it gets
> > > > > > > > > > > > a valid cmp icode, that
> > > > > > > > > > > > should be sufficient ?
> > > > > > > > > > >
> > > > > > > > > > > Yes
> > > > > > > > > > Hi Richard,
> > > > > > > > > > I tested the patch, and it shows one regression for pr78102.c, because
> > > > > > > > > > of extra pcmpeqq in code-gen for x != y on x86.
> > > > > > > > > > For the test-case:
> > > > > > > > > > __v2di
> > > > > > > > > > baz (const __v2di x, const __v2di y)
> > > > > > > > > > {
> > > > > > > > > >   return x != y;
> > > > > > > > > > }
> > > > > > > > > >
> > > > > > > > > > Before patch:
> > > > > > > > > > baz:
> > > > > > > > > >         pcmpeqq %xmm1, %xmm0
> > > > > > > > > >         pcmpeqd %xmm1, %xmm1
> > > > > > > > > >         pandn   %xmm1, %xmm0
> > > > > > > > > >         ret
> > > > > > > > > >
> > > > > > > > > > After patch,
> > > > > > > > > > Before ISEL:
> > > > > > > > > >   vector(2) <signed-boolean:64> _1;
> > > > > > > > > >   __v2di _4;
> > > > > > > > > >
> > > > > > > > > >   <bb 2> [local count: 1073741824]:
> > > > > > > > > >   _1 = x_2(D) != y_3(D);
> > > > > > > > > >   _4 = VEC_COND_EXPR <_1, { -1, -1 }, { 0, 0 }>;
> > > > > > > > > >   return _4;
> > > > > > > > > >
> > > > > > > > > > After ISEL:
> > > > > > > > > >   vector(2) <signed-boolean:64> _1;
> > > > > > > > > >   __v2di _4;
> > > > > > > > > >
> > > > > > > > > >   <bb 2> [local count: 1073741824]:
> > > > > > > > > >   _1 = x_2(D) != y_3(D);
> > > > > > > > > >   _4 = VIEW_CONVERT_EXPR<__v2di>(_1);
> > > > > > > > > >   return _4;
> > > > > > > > > >
> > > > > > > > > > which results in:
> > > > > > > > > >         pcmpeqq %xmm1, %xmm0
> > > > > > > > > >         pxor    %xmm1, %xmm1
> > > > > > > > > >         pcmpeqq %xmm1, %xmm0
> > > > > > > > > >         ret
> > > > seems better to be
> > > >
> > > >          pcmpeqq %xmm1, %xmm0
> > > >          pxor    %xmm1, %xmm1
> > > >          pxor %xmm1, %xmm0
> > > >          ret
> > > >
> > > > Anyway, it needs backend adjustment.
> > > >
> > > > > > > > > > IIUC, the new code-gen is essentially comparing two args for equality, and then
> > > > > > > > > > comparing the result against zero to invert it, so it looks correct ?
> > > > > > > > > > I am not sure which of the above two sequences is better tho ?
> > > > > > > > > > If the new code-gen is OK, would it be OK to adjust the test-case ?
> > > > > > > > >
> > > > > > > > > In case pcmpeqq is double-issue the first variant might be faster while
> > > > > > > > > the second variant has the advantage of the "free" pxor, but back-to-back
> > > > > > > > > pcmpeqq might have an issue.
> > > > > > > > >
> > > > > > > > > I think on GIMPLE the new code is preferable and adjustments are
> > > > > > > > > target business.  I wouldn't be surprised if the x86 backend
> > > > > > > > > special-cases vcond to {-1,-1}, {0,0} already to arrive at the first
> > > > > > > > > variant.
> > > > > > > > >
> > > > > > > > > Did you check how
> > > > > > > > >
> > > > > > > > > a = x != y ? { -1, -1 } : {0, 0 };
> > > > > > > > > b = x != y ? { 1, 2 } : { 3, 4 };
> > > > > > > > >
> > > > > > > > > is handled before/after your patch?  That is, make the comparison
> > > > > > > > > CSEd between two VEC_COND_EXPRs?
> > > > > > > > For the test-case:
> > > > > > > > __v2di f(__v2di, __v2di);
> > > > > > > >
> > > > > > > > __v2di
> > > > > > > > baz (const __v2di x, const __v2di y)
> > > > > > > > {
> > > > > > > >   __v2di a = (x != y);
> > > > > > > >   __v2di b = (x != y) ? (__v2di) {1, 2} : (__v2di) {3, 4};
> > > > > > > >   return f (a, b);
> > > > > > > > }
> > > > > > > >
> > > > > > > > Before patch, isel converts both to .vcondeq:
> > > > > > > >   __v2di b;
> > > > > > > >   __v2di a;
> > > > > > > >   __v2di _8;
> > > > > > > >
> > > > > > > >   <bb 2> [local count: 1073741824]:
> > > > > > > >   a_4 = .VCONDEQ (x_2(D), y_3(D), { -1, -1 }, { 0, 0 }, 114);
> > > > > > > >   b_5 = .VCONDEQ (x_2(D), y_3(D), { 1, 2 }, { 3, 4 }, 114);
> > > > > > > >   _8 = f (a_4, b_5); [tail call]
> > > > > > > >   return _8;
> > > > > > > >
> > > > > > > > and results in following code-gen:
> > > > > > > > _Z3bazDv2_xS_:
> > > > > > > > .LFB5666:
> > > > > > > >         pcmpeqq %xmm1, %xmm0
> > > > > > > >         pcmpeqd %xmm1, %xmm1
> > > > > > > >         movdqa  %xmm0, %xmm2
> > > > > > > >         pandn   %xmm1, %xmm2
> > > > > > > >         movdqa  .LC0(%rip), %xmm1
> > > > > > > >         pblendvb        %xmm0, .LC1(%rip), %xmm1
> > > > > > > >         movdqa  %xmm2, %xmm0
> > > > > > > >         jmp     _Z1fDv2_xS_
> > > > > > > >
> > > > > > > > With patch, isel converts a = (x != y) ? {-1, -1} : {0, 0} to
> > > > > > > > view_convert_expr and the other
> > > > > > > > to vcondeq:
> > > > > > > >   __v2di b;
> > > > > > > >   __v2di a;
> > > > > > > >   vector(2) <signed-boolean:64> _1;
> > > > > > > >   __v2di _8;
> > > > > > > >
> > > > > > > >   <bb 2> [local count: 1073741824]:
> > > > > > > >   _1 = x_2(D) != y_3(D);
> > > > > > > >   a_4 = VIEW_CONVERT_EXPR<__v2di>(_1);
> > > > > > > >   b_5 = .VCONDEQ (x_2(D), y_3(D), { 1, 2 }, { 3, 4 }, 114);
> > > > > > > >   _8 = f (a_4, b_5); [tail call]
> > > > > > > >   return _8;
> > > > > > > >
> > > > > > > > which results in following code-gen:
> > > > > > > > _Z3bazDv2_xS_:
> > > > > > > > .LFB5666:
> > > > > > > >         pcmpeqq %xmm1, %xmm0
> > > > > > > >         pxor    %xmm2, %xmm2
> > > > > > > >         movdqa  .LC0(%rip), %xmm1
> > > > > > > >         pblendvb        %xmm0, .LC1(%rip), %xmm1
> > > > > > > >         pcmpeqq %xmm0, %xmm2
> > > > > > > >         movdqa  %xmm2, %xmm0
> > > > > > > >         jmp     _Z1fDv2_xS_
> > > > > > >
> > > > > > > Ok, thanks for checking.  I think the patch is OK but please let
> > > > > > > Hongtao the chance to comment.
> > > > > > >
> > > > > > > Richard.
> > > > > > >
> > > > > > > > Thanks,
> > > > > > > > Prathamesh
> > > > > > > > >
> > > > > > > > > Thanks,
> > > > > > > > > Richard.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > > Thanks,
> > > > > > > > > > Prathamesh
> > > > > > > > > > >
> > > > > > > > > > > > Thanks,
> > > > > > > > > > > > Prathamesh
> > > > > > > > > > > > >
> > > > > > > > > > > > > Richard.
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > Prathamesh
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > > > Prathamesh
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > --
> > > > > > > > > > > > > > > Richard Biener <rguenther@suse.de>
> > > > > > > > > > > > > > > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> > > > > > > > > > > > > > > Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > --
> > > > > > > > > > > > > Richard Biener <rguenther@suse.de>
> > > > > > > > > > > > > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> > > > > > > > > > > > > Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > --
> > > > > > > > > > > Richard Biener <rguenther@suse.de>
> > > > > > > > > > > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> > > > > > > > > > > Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > > Richard Biener <rguenther@suse.de>
> > > > > > > > > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> > > > > > > > > Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
> > > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > > Richard Biener <rguenther@suse.de>
> > > > > > > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> > > > > > > Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
> > > > > >
> > > > > >
> > > > > >
> > > > > > --
> > > > > > BR,
> > > > > > Hongtao
> > > >
> > > >
> > > >
> > > > --
> > > > BR,
> > > > Hongtao
> >
>
> --
> Richard Biener <rguenther@suse.de>
> SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2020-12-10 13:00 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-12-01 10:59 Help with PR97872 Prathamesh Kulkarni
2020-12-01 11:09 ` Richard Biener
2020-12-03  9:06   ` Prathamesh Kulkarni
2020-12-03 11:05     ` Richard Biener
2020-12-04  9:22       ` Prathamesh Kulkarni
2020-12-04 11:48         ` Richard Biener
2020-12-07  5:53           ` Prathamesh Kulkarni
2020-12-07  7:31             ` Richard Biener
2020-12-07  9:33               ` Prathamesh Kulkarni
2020-12-07  9:45                 ` Richard Biener
2020-12-07 10:47                   ` Hongtao Liu
2020-12-07 11:10                     ` Prathamesh Kulkarni
2020-12-07 12:09                       ` Hongtao Liu
2020-12-08  9:06                         ` Prathamesh Kulkarni
2020-12-09 11:47                           ` Prathamesh Kulkarni
2020-12-10  1:30                             ` Hongtao Liu
2020-12-10  7:15                             ` Richard Biener
2020-12-10 13:00                               ` Prathamesh Kulkarni

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).