public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/105490] New: unvectorized loop due to bool condition loaded from memory and different size data
@ 2022-05-05  8:39 rguenth at gcc dot gnu.org
  2022-05-05  8:39 ` [Bug tree-optimization/105490] " rguenth at gcc dot gnu.org
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: rguenth at gcc dot gnu.org @ 2022-05-05  8:39 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105490

            Bug ID: 105490
           Summary: unvectorized loop due to bool condition loaded from
                    memory and different size data
           Product: gcc
           Version: 13.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: rguenth at gcc dot gnu.org
  Target Milestone: ---

Cloned from PR104595

#define N 256
typedef short T;
extern T a[N];
extern T b[N];
extern T c[N];
extern _Bool pb[N];

void predicate_by_bool()
{
  for (int i = 0; i < N; i++)
    c[i] = pb[i] ? a[i] : b[i];
}

where we expect vect_recog_mask_conversion_pattern to trigger here.  That
case can be fixed with

@@ -4658,9 +4660,9 @@ vect_recog_mask_conversion_pattern (vec_info *vinfo,

       if (TREE_CODE (rhs1) == SSA_NAME)
        {
-         rhs1_type = integer_type_for_mask (rhs1, vinfo);
-         if (!rhs1_type)
+         if (integer_type_for_mask (rhs1, vinfo))
            return NULL;
+         rhs1_type = TREE_TYPE (rhs1);
        }
       else if (COMPARISON_CLASS_P (rhs1))
        {

but we then run into the original issue again:

t.c:10:6: missed:   not vectorized: relevant stmt not supported: patt_28 =
(<signed-boolean:16>) _1;

The crucial difference between working and not working is the _1 != 0 ?: vs.
_1 ?:  - I think we do have a duplicate bug report here, possibly involving
combinations of different bools loaded from memory.

Trying to make bool pattern recog insert the relevant compensation code
is really iffy; the mask conversion pattern confuses things here.  What
we are missing really seems to be transforming the leaves.

Note that we do not try to pattern-recog a pattern, thus we cannot at the moment
have both vect_recog_bool_pattern and vect_recog_mask_conversion_pattern
at the same time on the ?: stmt.

Note IIRC vect_recog_bool_pattern has to come last but it's now after
vect_recog_mask_conversion_pattern.  Unfortunately swapping things does
not make vect_recog_bool_pattern recognize and fixup

  patt_24 = (<signed-boolean:16>) _1;


* [Bug tree-optimization/105490] unvectorized loop due to bool condition loaded from memory and different size data
  2022-05-05  8:39 [Bug tree-optimization/105490] New: unvectorized loop due to bool condition loaded from memory and different size data rguenth at gcc dot gnu.org
@ 2022-05-05  8:39 ` rguenth at gcc dot gnu.org
  2022-10-31 15:46 ` pinskia at gcc dot gnu.org
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: rguenth at gcc dot gnu.org @ 2022-05-05  8:39 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105490

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2022-05-05
             Blocks|                            |53947
           Keywords|                            |missed-optimization
             Status|UNCONFIRMED                 |NEW


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
[Bug 53947] [meta-bug] vectorizer missed-optimizations


* [Bug tree-optimization/105490] unvectorized loop due to bool condition loaded from memory and different size data
  2022-05-05  8:39 [Bug tree-optimization/105490] New: unvectorized loop due to bool condition loaded from memory and different size data rguenth at gcc dot gnu.org
  2022-05-05  8:39 ` [Bug tree-optimization/105490] " rguenth at gcc dot gnu.org
@ 2022-10-31 15:46 ` pinskia at gcc dot gnu.org
  2023-08-30  6:07 ` pinskia at gcc dot gnu.org
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: pinskia at gcc dot gnu.org @ 2022-10-31 15:46 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105490

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|normal                      |enhancement


* [Bug tree-optimization/105490] unvectorized loop due to bool condition loaded from memory and different size data
  2022-05-05  8:39 [Bug tree-optimization/105490] New: unvectorized loop due to bool condition loaded from memory and different size data rguenth at gcc dot gnu.org
  2022-05-05  8:39 ` [Bug tree-optimization/105490] " rguenth at gcc dot gnu.org
  2022-10-31 15:46 ` pinskia at gcc dot gnu.org
@ 2023-08-30  6:07 ` pinskia at gcc dot gnu.org
  2023-08-30 15:32 ` pinskia at gcc dot gnu.org
  2023-08-31  6:34 ` rguenther at suse dot de
  4 siblings, 0 replies; 6+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-08-30  6:07 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105490

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           See Also|                            |https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111149

--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
So what is interesting is we do handle:
```
#define N 256
typedef short T;
extern T a[N];
extern T b[N];
extern T c[N];
extern _Bool pb[N];
extern _Bool pb1[N];

void predicate_by_bool_ne()
{
  for (int i = 0; i < N; i++)
    c[i] = pb[i] != pb1[i] ? a[i] : b[i];
}
```

But not:
```
#define N 256
typedef short T;
extern T a[N];
extern T b[N];
extern T c[N];
extern _Bool pb[N];
extern _Bool pb1[N];

void predicate_by_bool_and()
{
  for (int i = 0; i < N; i++)
    c[i] = (pb[i] & pb1[i]) ? a[i] : b[i];
}
```

And if I change the canonical form of `bool != bool` into `bool ^ bool`, things
break down in a similar way. I tried to look into a reasonable way of handling
this in the vectorizer, but I could not figure out how to treat `^` in the same
way as `!=`.
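
As a sanity check on the equivalence relied on above: for operands restricted
to 0 and 1, `!=` and `^` compute the same value, so any difference in
vectorization comes from how the two tree codes are recognized, not from their
semantics. The following is only a minimal sketch; the helper names
(`ne_bool`, `xor_bool`, `check_ne_matches_xor`) are made up for illustration
and are not GCC functions.
```c
#include <assert.h>
#include <stdbool.h>

/* Comparison form, as in `pb[i] != pb1[i]`.  */
static bool ne_bool (bool x, bool y) { return x != y; }

/* Bitwise form, as in `pb[i] ^ pb1[i]`.  The operands are promoted to
   int, XORed, and the 0/1 result is converted back to bool.  */
static bool xor_bool (bool x, bool y) { return x ^ y; }

/* Exhaustively compare both forms over all four input combinations.  */
static void
check_ne_matches_xor (void)
{
  for (int x = 0; x <= 1; x++)
    for (int y = 0; y <= 1; y++)
      assert (ne_bool (x, y) == xor_bool (x, y));
}
```
Calling `check_ne_matches_xor ()` from a test driver exercises every
combination of the two bool inputs.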


* [Bug tree-optimization/105490] unvectorized loop due to bool condition loaded from memory and different size data
  2022-05-05  8:39 [Bug tree-optimization/105490] New: unvectorized loop due to bool condition loaded from memory and different size data rguenth at gcc dot gnu.org
                   ` (2 preceding siblings ...)
  2023-08-30  6:07 ` pinskia at gcc dot gnu.org
@ 2023-08-30 15:32 ` pinskia at gcc dot gnu.org
  2023-08-31  6:34 ` rguenther at suse dot de
  4 siblings, 0 replies; 6+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-08-30 15:32 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105490

--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Here is an even odder case:
```
#define N 256
typedef short T;
extern T a[N];
extern T b[N];
extern T c[N];
extern _Bool pb[N];
extern _Bool pb1[N];
extern _Bool pb2[N];

void predicate_by_booland()
{
  for (int i = 0; i < N; i++)
    c[i] = ((pb1[i] != pb[i]) != pb2[i]) ? a[i] : b[i];
}
```
This vectorizes currently with `-O3` but not with `-O3 -fno-tree-vrp`.
IR with -fno-tree-vrp:
```
  _1 = pb1[i_15];
  _2 = pb[i_15];
  _3 = _1 != _2;
  _4 = pb2[i_15];
  iftmp.0_10 = a[i_15];
  _5 = _3 != _4;
  iftmp.0_9 = b[i_15];
```
IR without (VRP turned on):
```
  _1 = pb1[i_15];
  _2 = pb[i_15];
  _3 = _1 ^ _2;
  _4 = pb2[i_15];
  iftmp.0_10 = a[i_15];
  _5 = _3 != _4;
  iftmp.0_9 = b[i_15];
```

So it is even more confusing ...


* [Bug tree-optimization/105490] unvectorized loop due to bool condition loaded from memory and different size data
  2022-05-05  8:39 [Bug tree-optimization/105490] New: unvectorized loop due to bool condition loaded from memory and different size data rguenth at gcc dot gnu.org
                   ` (3 preceding siblings ...)
  2023-08-30 15:32 ` pinskia at gcc dot gnu.org
@ 2023-08-31  6:34 ` rguenther at suse dot de
  4 siblings, 0 replies; 6+ messages in thread
From: rguenther at suse dot de @ 2023-08-31  6:34 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105490

--- Comment #3 from rguenther at suse dot de <rguenther at suse dot de> ---
On Wed, 30 Aug 2023, pinskia at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105490
> 
> --- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
> Here is an even odder case:
> ```
> #define N 256
> typedef short T;
> extern T a[N];
> extern T b[N];
> extern T c[N];
> extern _Bool pb[N];
> extern _Bool pb1[N];
> extern _Bool pb2[N];
> 
> void predicate_by_booland()
> {
>   for (int i = 0; i < N; i++)
>     c[i] = ((pb1[i] != pb[i]) != pb2[i]) ? a[i] : b[i];
> }
> ```
> This vectorizes currently with `-O3` but not with `-O3 -fno-tree-vrp`.
> IR with -fno-tree-vrp:
> ```
>   _1 = pb1[i_15];
>   _2 = pb[i_15];
>   _3 = _1 != _2;
>   _4 = pb2[i_15];
>   iftmp.0_10 = a[i_15];
>   _5 = _3 != _4;
>   iftmp.0_9 = b[i_15];
> ```
> IR without (VRP turned on):
> ```
>   _1 = pb1[i_15];
>   _2 = pb[i_15];
>   _3 = _1 ^ _2;
>   _4 = pb2[i_15];
>   iftmp.0_10 = a[i_15];
>   _5 = _3 != _4;
>   iftmp.0_9 = b[i_15];
> ```
> 
> So it is even more confusing ...

This is usually due to limits/bugs in the vectorizer bool pattern
recognition code which is supposed to replace "mask" uses of
bool with x ? -1 : 0 (or make sure comparisons produce them)
and data uses with x ? 1 : 0 (or make sure "data" stmts produce them).
There are defenses in vectorizable_* to detect cases where that did
go wrong (which it does sometimes), leading to missed vectorizations
(or wrong code in the worst case).
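
To illustrate the "mask" versus "data" distinction described above, here is a
scalar sketch in C for the `short` element type used in the test cases. It is
not vectorizer code: the helpers `bool_as_data`, `bool_as_mask`, `blend` and
`select_by_bool` are hypothetical names, and `blend` only approximates what a
per-lane vector select does.
```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* "Data" use of a bool: the value 1 or 0 in the element type.  */
static int16_t bool_as_data (bool x) { return x ? 1 : 0; }

/* "Mask" use of a bool: all bits set or all bits clear, so that it
   can drive a bitwise select.  */
static int16_t bool_as_mask (bool x) { return x ? -1 : 0; }

/* Bitwise select: take A where MASK bits are set and B elsewhere.  */
static int16_t
blend (int16_t mask, int16_t a, int16_t b)
{
  return (int16_t) ((mask & a) | (~mask & b));
}

/* Scalar emulation of c[i] = pb[i] ? a[i] : b[i] for 16-bit data:
   each bool loaded from memory is first turned into a 16-bit mask.  */
static void
select_by_bool (int16_t *c, const int16_t *a, const int16_t *b,
                const bool *pb, int n)
{
  for (int i = 0; i < n; i++)
    c[i] = blend (bool_as_mask (pb[i]), a[i], b[i]);
}

/* The mask form selects correctly; feeding the data form into the
   same select gives a wrong result.  */
static void
check_mask_vs_data (void)
{
  const int16_t a = 0x1234, b = 0x00F0;
  assert (blend (bool_as_mask (true), a, b) == a);
  assert (blend (bool_as_mask (false), a, b) == b);
  assert (blend (bool_as_data (true), a, b) != a);
}
```
The last assertion shows why the two forms must not be mixed up: a bool left
in its 0/1 data form only selects the lowest bit of `a`.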

It's a mess.
