public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/94727] New: GCC produces incorrect code with -O3
From: vsevolod.livinskij at frtk dot ru @ 2020-04-23  5:14 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94727

            Bug ID: 94727
           Summary: GCC produces incorrect code with -O3
           Product: gcc
           Version: 10.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: vsevolod.livinskij at frtk dot ru
  Target Milestone: ---

GCC produces incorrect code with -O3.

Reproducer:
>$ cat func.cpp 
extern bool var_3;
extern unsigned long long int var_11;
extern unsigned char arr_41 [14] [14] [19] [16] [18] ;
extern long long int arr_42 [14] [14] [19] [16] [18] ;

void test() {
  for (int a = 0; a < 3; a++)
    for (char b = 0; b < 3; b = 4)
      for (int c = 0; c < 8; c++)
        for (short d = 0; d < 5; d += 4)
          for (char e = 0; e < 17; e++)
            arr_42[a][b][c][d][e] = (arr_41[a][b][c][d][e] < var_11) > var_3;
}

>$ cat driver.cpp 
#include <stdio.h>

bool var_3 = (bool)0;
unsigned long long int var_11 = 14035841137156193017ULL;
unsigned char arr_41 [14] [14] [19] [16] [18] ;
long long int arr_42 [14] [14] [19] [16] [18] ;
void test();

int main() {
    for (size_t i_0 = 0; i_0 < 14; ++i_0)
        for (size_t i_1 = 0; i_1 < 14; ++i_1)
            for (size_t i_2 = 0; i_2 < 19; ++i_2)
                for (size_t i_3 = 0; i_3 < 16; ++i_3)
                    for (size_t i_4 = 0; i_4 < 18; ++i_4)
                        arr_41 [i_0] [i_1] [i_2] [i_3] [i_4] = (unsigned char)226;
    for (size_t i_0 = 0; i_0 < 14; ++i_0)  
        for (size_t i_1 = 0; i_1 < 14; ++i_1)
            for (size_t i_2 = 0; i_2 < 19; ++i_2)
                for (size_t i_3 = 0; i_3 < 16; ++i_3)
                    for (size_t i_4 = 0; i_4 < 18; ++i_4)
                        arr_42 [i_0] [i_1] [i_2] [i_3] [i_4] = -5577957210778461327LL;
    test();
    printf("%lld\n", arr_42[0][0][0][0][0]);
}

Error:
>$ g++ func.cpp driver.cpp -O0 && ./a.out 
1
>$ g++ func.cpp driver.cpp -O3 && ./a.out 
0

GCC version:
10.0.1 (87841658d4fa5174d1797ee0abc73b3b3f11cad4)

* [Bug tree-optimization/94727] [10 Regression] GCC produces incorrect code with -O3 since r10-5071-g02d895504cc59be0
From: marxin at gcc dot gnu.org @ 2020-04-23  5:35 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94727

Martin Liška <marxin at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
            Summary|GCC produces incorrect code |[10 Regression] GCC
                   |with -O3                    |produces incorrect code
                   |                            |with -O3 since
                   |                            |r10-5071-g02d895504cc59be0
   Last reconfirmed|                            |2020-04-23
     Ever confirmed|0                           |1
      Known to work|                            |9.3.0
   Target Milestone|---                         |10.0
                 CC|                            |marxin at gcc dot gnu.org,
                   |                            |rsandifo at gcc dot gnu.org
           Priority|P3                          |P1

--- Comment #1 from Martin Liška <marxin at gcc dot gnu.org> ---
Confirmed, started with r10-5071-g02d895504cc59be0.

* [Bug tree-optimization/94727] [10 Regression] GCC produces incorrect code with -O3 since r10-5071-g02d895504cc59be0
From: rguenth at gcc dot gnu.org @ 2020-04-23  6:44 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94727

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Target|                            |x86_64-*-*

--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
test() is basic-block vectorized on the c-loop body which has the d- and
e-loops completely unrolled.  We create some weird

  mask__84.17_18 = vect_cst__94 < vect_cst__20;
  vect_patt_266.18_6 = VEC_COND_EXPR <mask__84.17_18, vect_cst__16, vect_cst__15>;

where vect_cst__94 and vect_cst__20 look like

 { var_3.1_42 ? -1 : 0, var_3.1_42 ? -1 : 0, ...

or even

 { _272 ? -1 : 0, ...

with

 _272 = _271 < var_11.0_40

so the vectorization is quite imperfect, mainly due to

t1.C:9:32: note:   ==> examining statement: _85 = var_11.0_40 > _86;
t1.C:12:69: missed:   not vectorized: relevant stmt not supported: _85 = var_11.0_40 > _86;
t1.C:9:32: note:   Building vector operands from scalars instead

that doesn't yet explain what goes wrong here.

* [Bug tree-optimization/94727] [10 Regression] GCC produces incorrect code with -O3 since r10-5071-g02d895504cc59be0
From: rguenth at gcc dot gnu.org @ 2020-04-23  7:02 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94727

--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #2)
> test() is basic-block vectorized on the c-loop body which has the d- and
> e-loops completely unrolled.  We create some weird
> 
>   mask__84.17_18 = vect_cst__94 < vect_cst__20;
>   vect_patt_266.18_6 = VEC_COND_EXPR <mask__84.17_18, vect_cst__16, vect_cst__15>;
> 
> where vect_cst__94 and vect_cst__20 look like
> 
>  { var_3.1_42 ? -1 : 0, var_3.1_42 ? -1 : 0, ...
> 
> or even
> 
>  { _272 ? -1 : 0, ...
> 
> with
> 
>  _272 = _271 < var_11.0_40
> 
> so the vectorization is quite imperfect, mainly due to
> 
> t1.C:9:32: note:   ==> examining statement: _85 = var_11.0_40 > _86;
> t1.C:12:69: missed:   not vectorized: relevant stmt not supported: _85 = var_11.0_40 > _86;
> t1.C:9:32: note:   Building vector operands from scalars instead
> 
> that doesn't yet explain what goes wrong here.

If you change the e loop iteration to stop at e < 16 the testcase is not
miscompiled, so somehow the

t1.C:9:20: missed:   Build SLP failed: incompatible vector types for: arr_42[a_26][0][c_11][0][16] = _44;
t1.C:9:20: note:       old vector type: vector(2) long long int
t1.C:9:20: note:       new vector type: vector(1) long long int

might be a relevant factor.  Leaving to Richard.

* [Bug tree-optimization/94727] [10 Regression] GCC produces incorrect code with -O3 since r10-5071-g02d895504cc59be0
From: rsandifo at gcc dot gnu.org @ 2020-04-23  7:34 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94727

rsandifo at gcc dot gnu.org <rsandifo at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org      |rsandifo at gcc dot gnu.org

--- Comment #4 from rsandifo at gcc dot gnu.org <rsandifo at gcc dot gnu.org> ---
OK, I'll take a look.

* [Bug tree-optimization/94727] [10 Regression] GCC produces incorrect code with -O3 since r10-5071-g02d895504cc59be0
From: rsandifo at gcc dot gnu.org @ 2020-04-23 10:26 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94727

--- Comment #5 from rsandifo at gcc dot gnu.org <rsandifo at gcc dot gnu.org> ---
Well, this is a bit of a mess (surprise).  We have a "<" comparison
between two booleans that are leaves of the SLP tree, so
vectorizable_comparison falls back on:

  /* Invariant comparison.  */
  if (!vectype)
    {
      vectype = get_vectype_for_scalar_type (vinfo, TREE_TYPE (rhs1),
                                             slp_node);
      if (maybe_ne (TYPE_VECTOR_SUBPARTS (vectype), nunits))
        return false;
    }

rhs1 and rhs2 are *unsigned* boolean types, so we get back a vector
of unsigned integers.  All is well, and "<" works as expected without
the need for:

  /* Boolean values may have another representation in vectors
     and therefore we prefer bit operations over comparison for
     them (which also works for scalar masks).  We store opcodes
     to use in bitop1 and bitop2.  Statement is vectorized as
       BITOP2 (rhs1 BITOP1 rhs2) or
       rhs1 BITOP2 (BITOP1 rhs2)
     depending on bitop1 and bitop2 arity.  */
  bool swap_p = false;
  if (VECTOR_BOOLEAN_TYPE_P (vectype))
    {

However, we then defer to vect_get_slp_defs to get the actual operands.
The expected vector type is not part of this interface.  The request
goes to vect_get_constant_vectors, which does:

  if (VECT_SCALAR_BOOLEAN_TYPE_P (TREE_TYPE (op))
      && vect_mask_constant_operand_p (stmt_vinfo))
    vector_type = truth_type_for (stmt_vectype);
  else
    vector_type = get_vectype_for_scalar_type (vinfo, TREE_TYPE (op), op_node);

So the function gives back a vector of mask types, which here are
vectors of *signed* booleans.  This means that "<" gives:

  true (-1) < false (0)

and so the boolean fixup above was needed after all.

I'm going to try:

diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
index 7f3a9fb5fb3..88a1e2c51d2 100644
--- a/gcc/tree-vect-stmts.c
+++ b/gcc/tree-vect-stmts.c
@@ -10566,8 +10566,11 @@ vectorizable_comparison (stmt_vec_info stmt_info, gimple_stmt_iterator *gsi,
   /* Invariant comparison.  */
   if (!vectype)
     {
-      vectype = get_vectype_for_scalar_type (vinfo, TREE_TYPE (rhs1),
-                                            slp_node);
+      if (VECT_SCALAR_BOOLEAN_TYPE_P (TREE_TYPE (rhs1)))
+       vectype = mask_type;
+      else
+       vectype = get_vectype_for_scalar_type (vinfo, TREE_TYPE (rhs1),
+                                              slp_node);
       if (!vectype || maybe_ne (TYPE_VECTOR_SUBPARTS (vectype), nunits))
        return false;
     }

which does at least fix the testcase.

* [Bug tree-optimization/94727] [10 Regression] GCC produces incorrect code with -O3 since r10-5071-g02d895504cc59be0
From: rguenth at gcc dot gnu.org @ 2020-04-23 12:35 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94727

--- Comment #6 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to rsandifo@gcc.gnu.org from comment #5)
> Well, this is a bit of a mess (surprise).  We have a "<" comparison
> between two booleans that are leaves of the SLP tree, so
> vectorizable_comparison falls back on:
> 
>   /* Invariant comparison.  */
>   if (!vectype)
>     {
>       vectype = get_vectype_for_scalar_type (vinfo, TREE_TYPE (rhs1),
>                                              slp_node);
>       if (maybe_ne (TYPE_VECTOR_SUBPARTS (vectype), nunits))
>         return false;
>     }
> 
> rhs1 and rhs2 are *unsigned* boolean types, so we get back a vector
> of unsigned integers.  All is well, and "<" works as expected without
> the need for:
> 
>   /* Boolean values may have another representation in vectors
>      and therefore we prefer bit operations over comparison for
>      them (which also works for scalar masks).  We store opcodes
>      to use in bitop1 and bitop2.  Statement is vectorized as
>        BITOP2 (rhs1 BITOP1 rhs2) or
>        rhs1 BITOP2 (BITOP1 rhs2)
>      depending on bitop1 and bitop2 arity.  */
>   bool swap_p = false;
>   if (VECTOR_BOOLEAN_TYPE_P (vectype))
>     {
> 
> However, we then defer to vect_get_slp_defs to get the actual operands.
> The expected vector type is not part of this interface.

Ah yeah - sth on my list to fix (not making the type part of that API
but assigning vector types to SLP nodes).  I even have partly completed
"hacks" to do that.  When we have (and use!) vector types on all SLP
nodes we can also get rid of the mismatch code.

> The request
> goes to vect_get_constant_vectors, which does:
> 
>   if (VECT_SCALAR_BOOLEAN_TYPE_P (TREE_TYPE (op))
>       && vect_mask_constant_operand_p (stmt_vinfo))
>     vector_type = truth_type_for (stmt_vectype);
>   else
>     vector_type = get_vectype_for_scalar_type (vinfo, TREE_TYPE (op), op_node);
> 
> So the function gives back a vector of mask types, which here are
> vectors of *signed* booleans.  This means that "<" gives:
> 
>   true (-1) < false (0)
> 
> and so the boolean fixup above was needed after all.
> 
> I'm going to try:
> 
> diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
> index 7f3a9fb5fb3..88a1e2c51d2 100644
> --- a/gcc/tree-vect-stmts.c
> +++ b/gcc/tree-vect-stmts.c
> @@ -10566,8 +10566,11 @@ vectorizable_comparison (stmt_vec_info stmt_info, gimple_stmt_iterator *gsi,
>    /* Invariant comparison.  */
>    if (!vectype)
>      {
> -      vectype = get_vectype_for_scalar_type (vinfo, TREE_TYPE (rhs1),
> -                                            slp_node);
> +      if (VECT_SCALAR_BOOLEAN_TYPE_P (TREE_TYPE (rhs1)))
> +       vectype = mask_type;
> +      else
> +       vectype = get_vectype_for_scalar_type (vinfo, TREE_TYPE (rhs1),
> +                                              slp_node);
>        if (!vectype || maybe_ne (TYPE_VECTOR_SUBPARTS (vectype), nunits))
>         return false;
>      }
> 
> which does at least fix the testcase.

LGTM.

* [Bug tree-optimization/94727] [10 Regression] GCC produces incorrect code with -O3 since r10-5071-g02d895504cc59be0
From: rsandifo at gcc dot gnu.org @ 2020-04-23 13:02 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94727

--- Comment #7 from rsandifo at gcc dot gnu.org <rsandifo at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #6)
> (In reply to rsandifo@gcc.gnu.org from comment #5)
> > 
> > However, we then defer to vect_get_slp_defs to get the actual operands.
> > The expected vector type is not part of this interface.
> 
> Ah yeah - sth on my list to fix (not making the type part of that API
> but assigning vector types to SLP nodes).  I even have partly completed
> "hacks" to do that.  When we have (and use!) vector types on all SLP
> nodes we can also get rid of the mismatch code.

Sounds great!  The fewer decisions we make on the fly the better...

> > I'm going to try:
> > 
> > diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
> > index 7f3a9fb5fb3..88a1e2c51d2 100644
> > --- a/gcc/tree-vect-stmts.c
> > +++ b/gcc/tree-vect-stmts.c
> > @@ -10566,8 +10566,11 @@ vectorizable_comparison (stmt_vec_info stmt_info, gimple_stmt_iterator *gsi,
> >    /* Invariant comparison.  */
> >    if (!vectype)
> >      {
> > -      vectype = get_vectype_for_scalar_type (vinfo, TREE_TYPE (rhs1),
> > -                                            slp_node);
> > +      if (VECT_SCALAR_BOOLEAN_TYPE_P (TREE_TYPE (rhs1)))
> > +       vectype = mask_type;
> > +      else
> > +       vectype = get_vectype_for_scalar_type (vinfo, TREE_TYPE (rhs1),
> > +                                              slp_node);
> >        if (!vectype || maybe_ne (TYPE_VECTOR_SUBPARTS (vectype), nunits))
> >         return false;
> >      }
> > 
> > which does at least fix the testcase.
> 
> LGTM.

Thanks.  Bootstraps on aarch64-linux-gnu and x86_64-linux-gnu passed;
now trying an SVE test run.  Will commit if that passes too.

* [Bug tree-optimization/94727] [10 Regression] GCC produces incorrect code with -O3 since r10-5071-g02d895504cc59be0
From: cvs-commit at gcc dot gnu.org @ 2020-04-23 14:45 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94727

--- Comment #8 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Richard Sandiford <rsandifo@gcc.gnu.org>:

https://gcc.gnu.org/g:901f5289d9465d4c388ae288f850ad4f29e99a2c

commit r10-7915-g901f5289d9465d4c388ae288f850ad4f29e99a2c
Author: Richard Sandiford <richard.sandiford@arm.com>
Date:   Thu Apr 23 15:45:43 2020 +0100

    vect: Fix comparisons between invariant booleans [PR94727]

    This PR was caused by mismatched expectations between
    vectorizable_comparison and SLP.  We had a "<" comparison
    between two booleans that were leaves of the SLP tree, so
    vectorizable_comparison fell back on:

      /* Invariant comparison.  */
      if (!vectype)
        {
          vectype = get_vectype_for_scalar_type (vinfo, TREE_TYPE (rhs1),
                                                 slp_node);
          if (maybe_ne (TYPE_VECTOR_SUBPARTS (vectype), nunits))
            return false;
        }

    rhs1 and rhs2 were *unsigned* boolean types, so we got back a vector
    of unsigned integers.  This in itself was OK, and meant that "<"
    worked as expected without the need for the boolean fix-ups:

      /* Boolean values may have another representation in vectors
         and therefore we prefer bit operations over comparison for
         them (which also works for scalar masks).  We store opcodes
         to use in bitop1 and bitop2.  Statement is vectorized as
           BITOP2 (rhs1 BITOP1 rhs2) or
           rhs1 BITOP2 (BITOP1 rhs2)
         depending on bitop1 and bitop2 arity.  */
      bool swap_p = false;
      if (VECTOR_BOOLEAN_TYPE_P (vectype))
        {

    However, vectorizable_comparison then used vect_get_slp_defs to get
    the actual operands.  The request went to vect_get_constant_vectors,
    which also has logic to calculate the vector type.  The problem was
    that this type was different from the one chosen above:

      if (VECT_SCALAR_BOOLEAN_TYPE_P (TREE_TYPE (op))
          && vect_mask_constant_operand_p (stmt_vinfo))
        vector_type = truth_type_for (stmt_vectype);
      else
        vector_type = get_vectype_for_scalar_type (vinfo, TREE_TYPE (op), op_node);

    So the function gave back a vector of mask types, which here are vectors
    of *signed* booleans.  This meant that "<" gave:

      true (-1) < false (0)

    and so the boolean fixup above was needed after all.

    Fixed by making vectorizable_comparison also pick a mask type in
    this case.

    2020-04-23  Richard Sandiford  <richard.sandiford@arm.com>

    gcc/
            PR tree-optimization/94727
            * tree-vect-stmts.c (vectorizable_comparison): Use mask_type when
            comparing invariant scalar booleans.

    gcc/testsuite/
            PR tree-optimization/94727
            * gcc.dg/vect/pr94727.c: New test.

* [Bug tree-optimization/94727] [10 Regression] GCC produces incorrect code with -O3 since r10-5071-g02d895504cc59be0
From: rsandifo at gcc dot gnu.org @ 2020-04-23 14:48 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94727

rsandifo at gcc dot gnu.org <rsandifo at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #9 from rsandifo at gcc dot gnu.org <rsandifo at gcc dot gnu.org> ---
Fixed on master.  Thanks for the report.

* [Bug tree-optimization/94727] [10 Regression] GCC produces incorrect code with -O3 since r10-5071-g02d895504cc59be0
From: cvs-commit at gcc dot gnu.org @ 2020-04-28  7:04 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94727

--- Comment #10 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Richard Sandiford <rsandifo@gcc.gnu.org>:

https://gcc.gnu.org/g:e62a820d686d1fa97a9eefdc65ca07d8f96ac9f4

commit r10-8006-ge62a820d686d1fa97a9eefdc65ca07d8f96ac9f4
Author: Richard Sandiford <richard.sandiford@arm.com>
Date:   Tue Apr 28 08:04:29 2020 +0100

    vect: Fix COND_EXPRs involving variant booleans [PR94727]

    The previous patch for this PR handled separate comparisons.
    However, as arm targets show, the same fix is needed when
    handling comparisons embedded in a VEC_COND_EXPR.

    Here too, the problem is that vect_get_constant_vectors will
    calculate its own vector type, using truth_type_for on the
    STMT_VINFO_VECTYPE, and the vectorizable_* routines need to be
    consistent with that.

    2020-04-28  Richard Sandiford  <richard.sandiford@arm.com>

    gcc/
            PR tree-optimization/94727
            * tree-vect-stmts.c (vect_is_simple_cond): If both comparison
            operands are invariant booleans, use the mask type associated with
            the STMT_VINFO_VECTYPE.  Use !slp_node instead of !vectype to
            exclude SLP.
            (vectorizable_condition): Pass vectype unconditionally to
            vect_is_simple_cond.
