public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/95906] New: Failure to recognize max pattern with mask
@ 2020-06-26  4:17 gabravier at gmail dot com
  2020-06-26  6:52 ` [Bug tree-optimization/95906] " rguenth at gcc dot gnu.org
                   ` (11 more replies)
  0 siblings, 12 replies; 13+ messages in thread
From: gabravier at gmail dot com @ 2020-06-26  4:17 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95906

            Bug ID: 95906
           Summary: Failure to recognize max pattern with mask
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: gabravier at gmail dot com
  Target Milestone: ---

#include <stdint.h>

typedef int8_t v16i8 __attribute__((__vector_size__ (16)));

v16i8 f(v16i8 a, v16i8 b)
{
    v16i8 cmp = (a > b);
    return (cmp & a) | (~cmp & b);
}

int f2(int a, int b)
{
    int cmp = -(a > b);
    return (cmp & a) | (~cmp & b);
}

`f` can be optimized to `__builtin_ia32_pmaxsb128` (on x86 with `-msse4`; the
other `pmax` instructions cover the same pattern for similar element types),
and `f2` can be optimized to a `MAX_EXPR`. The two functions are essentially
the same; I've included the vector version because I originally found this
pattern in a function written for SSE, before SSE4 existed. LLVM does these
transformations, but GCC does not.
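
As a quick sanity check of the claim that `f2` is a max, here is a minimal
sketch that compares the masked form against a plain conditional on a few
boundary values. The names `max_masked`, `max_plain` and `forms_agree` are
made up for this illustration and are not part of the report.

```c
#include <limits.h>

/* Masked form from the report: cmp is all-ones when a > b, otherwise zero. */
static int max_masked(int a, int b)
{
    int cmp = -(a > b);
    return (cmp & a) | (~cmp & b);
}

/* Plain conditional that the masked form is expected to be equivalent to. */
static int max_plain(int a, int b)
{
    return a > b ? a : b;
}

/* Returns 1 if both forms agree on every pair of the sample values. */
static int forms_agree(void)
{
    static const int samples[] = { INT_MIN, -2, -1, 0, 1, 2, INT_MAX };
    const int n = (int)(sizeof samples / sizeof samples[0]);

    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            if (max_masked(samples[i], samples[j])
                != max_plain(samples[i], samples[j]))
                return 0;
    return 1;
}
```

Calling `forms_agree()` from a test driver only exercises a handful of inputs;
it illustrates the equivalence rather than proving it.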


* [Bug tree-optimization/95906] Failure to recognize max pattern with mask
  2020-06-26  4:17 [Bug tree-optimization/95906] New: Failure to recognize max pattern with mask gabravier at gmail dot com
@ 2020-06-26  6:52 ` rguenth at gcc dot gnu.org
  2020-06-26 21:23 ` glisse at gcc dot gnu.org
                   ` (10 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: rguenth at gcc dot gnu.org @ 2020-06-26  6:52 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95906

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
           Keywords|                            |easyhack
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2020-06-26


* [Bug tree-optimization/95906] Failure to recognize max pattern with mask
  2020-06-26  4:17 [Bug tree-optimization/95906] New: Failure to recognize max pattern with mask gabravier at gmail dot com
  2020-06-26  6:52 ` [Bug tree-optimization/95906] " rguenth at gcc dot gnu.org
@ 2020-06-26 21:23 ` glisse at gcc dot gnu.org
  2020-08-05 14:47 ` cvs-commit at gcc dot gnu.org
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: glisse at gcc dot gnu.org @ 2020-06-26 21:23 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95906

--- Comment #1 from Marc Glisse <glisse at gcc dot gnu.org> ---
I'd say generate a (vec_)cond_expr, not directly a max. That is, replace the
comparison with any truth_valued_p (hmm, that function probably stopped working
for vectors when all comparisons were wrapped in vec_cond for avx512).


* [Bug tree-optimization/95906] Failure to recognize max pattern with mask
  2020-06-26  4:17 [Bug tree-optimization/95906] New: Failure to recognize max pattern with mask gabravier at gmail dot com
  2020-06-26  6:52 ` [Bug tree-optimization/95906] " rguenth at gcc dot gnu.org
  2020-06-26 21:23 ` glisse at gcc dot gnu.org
@ 2020-08-05 14:47 ` cvs-commit at gcc dot gnu.org
  2020-08-05 15:08 ` glisse at gcc dot gnu.org
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2020-08-05 14:47 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95906

--- Comment #2 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Marc Glisse <glisse@gcc.gnu.org>:

https://gcc.gnu.org/g:229752afe3156a3990dacaedb94c76846cebf132

commit r11-2577-g229752afe3156a3990dacaedb94c76846cebf132
Author: Marc Glisse <marc.glisse@inria.fr>
Date:   Wed Aug 5 16:45:33 2020 +0200

    VEC_COND_EXPR optimizations

    When vector comparisons were forced to use vec_cond_expr, we lost a number
    of optimizations (my fault for not adding enough testcases to prevent
    that). This patch tries to unwrap vec_cond_expr a bit so some
    optimizations can still happen.

    I wasn't planning to add all those transformations together, but adding
    one caused a regression, whose fix introduced a second regression, etc.

    Restricting to constant folding would not be sufficient, we also need at
    least things like X|0 or X&X. The transformations are quite conservative
    with :s and folding only if everything simplifies, we may want to relax
    this later. And of course we are going to miss things like
    a?b:c + a?c:b -> b+c.

    In terms of number of operations, some transformations turning 2
    VEC_COND_EXPR into VEC_COND_EXPR + BIT_IOR_EXPR + BIT_NOT_EXPR might not
    look like a gain... I expect the bit_not disappears in most cases, and
    VEC_COND_EXPR looks more costly than a simpler BIT_IOR_EXPR.

    2020-08-05  Marc Glisse  <marc.glisse@inria.fr>

            PR tree-optimization/95906
            PR target/70314
            * match.pd ((c ? a : b) op d, (c ? a : b) op (c ? d : e),
            (v ? w : 0) ? a : b, c1 ? c2 ? a : b : b): New transformations.
            (op (c ? a : b)): Update to match the new transformations.

            * gcc.dg/tree-ssa/andnot-2.c: New file.
            * gcc.dg/tree-ssa/pr95906.c: Likewise.
            * gcc.target/i386/pr70314.c: Likewise.
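
The commit message names `a?b:c + a?c:b -> b+c` as a simplification the patch
still misses. A scalar analogue of that identity can be sketched as follows;
the patch itself works on vector conditions, and the function names here are
invented for illustration. Unsigned operands are used so the addition cannot
overflow.

```c
/* Sum of two selects on the same condition with swapped arms. */
static unsigned sum_of_selects(int a, unsigned b, unsigned c)
{
    return (a ? b : c) + (a ? c : b);
}

/* The simplified form named in the commit message. */
static unsigned plain_sum(unsigned b, unsigned c)
{
    return b + c;
}
```

Whichever way the condition goes, each of `b` and `c` is picked exactly once,
which is why the sum does not depend on `a`.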


* [Bug tree-optimization/95906] Failure to recognize max pattern with mask
  2020-06-26  4:17 [Bug tree-optimization/95906] New: Failure to recognize max pattern with mask gabravier at gmail dot com
                   ` (2 preceding siblings ...)
  2020-08-05 14:47 ` cvs-commit at gcc dot gnu.org
@ 2020-08-05 15:08 ` glisse at gcc dot gnu.org
  2021-07-25  4:37 ` pinskia at gcc dot gnu.org
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: glisse at gcc dot gnu.org @ 2020-08-05 15:08 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95906

--- Comment #3 from Marc Glisse <glisse at gcc dot gnu.org> ---
With the patch (which only affects vectors), f becomes (a>b)?a:b. It should be
easy to add the corresponding transform to MAX_EXPR in match.pd.
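
To illustrate why `(a>b)?a:b` on vectors is a max candidate, the sketch below
runs the masked `f` from the report through GCC's generic vector extension and
compares each lane with a scalar maximum. It assumes a compiler that supports
`__vector_size__`; `lanes_match_max` is a hypothetical helper written for this
example.

```c
#include <stdint.h>
#include <string.h>

typedef int8_t v16i8 __attribute__((__vector_size__ (16)));

/* Masked select from the report; each lane of cmp is -1 or 0. */
static v16i8 f(v16i8 a, v16i8 b)
{
    v16i8 cmp = (a > b);
    return (cmp & a) | (~cmp & b);
}

/* Returns 1 if every lane of f(a, b) equals the scalar max of that lane. */
static int lanes_match_max(const int8_t a[16], const int8_t b[16])
{
    v16i8 va, vb, vr;
    int8_t r[16];

    memcpy(&va, a, sizeof va);
    memcpy(&vb, b, sizeof vb);
    vr = f(va, vb);
    memcpy(r, &vr, sizeof r);

    for (int i = 0; i < 16; i++)
        if (r[i] != (a[i] > b[i] ? a[i] : b[i]))
            return 0;
    return 1;
}
```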


* [Bug tree-optimization/95906] Failure to recognize max pattern with mask
  2020-06-26  4:17 [Bug tree-optimization/95906] New: Failure to recognize max pattern with mask gabravier at gmail dot com
                   ` (3 preceding siblings ...)
  2020-08-05 15:08 ` glisse at gcc dot gnu.org
@ 2021-07-25  4:37 ` pinskia at gcc dot gnu.org
  2023-06-09 15:25 ` pinskia at gcc dot gnu.org
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-07-25  4:37 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95906

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |pinskia at gcc dot gnu.org
           Severity|normal                      |enhancement


* [Bug tree-optimization/95906] Failure to recognize max pattern with mask
  2020-06-26  4:17 [Bug tree-optimization/95906] New: Failure to recognize max pattern with mask gabravier at gmail dot com
                   ` (4 preceding siblings ...)
  2021-07-25  4:37 ` pinskia at gcc dot gnu.org
@ 2023-06-09 15:25 ` pinskia at gcc dot gnu.org
  2023-06-09 20:45 ` pinskia at gcc dot gnu.org
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-06-09 15:25 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95906

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org      |pinskia at gcc dot gnu.org

--- Comment #4 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
For the scalar we have:

  _1 = a_5(D) > b_6(D);
  _9 = _2 + -1;
  _4 = b_6(D) & _9;

Where _2 is the int conversion of _1, so it is zero_one_valued_p.

So we could match that:
/* ((m1 CMP m2) + -1) & d -> (m1 CMP m2) ? 0 : d  */
  (simplify
   (bit_and:c (plus (convert (cmp@0 @1 @2)) integer_minus_onep) @3)
   (if (INTEGRAL_TYPE_P (type)
        && INTEGRAL_TYPE_P (TREE_TYPE (@0)))
     (cond @0 { build_zero_cst (type); } @3)))

Like we do already for `(-(m1 CMP m2)) & d -> (m1 CMP m2) ? d : 0`.

This should get us:
  _3 = _1 ? a_5(D) : 0;
  _4 = _1 ? 0 : b_6(D);
  _7 = _3 | _4;

Which then can be reduced to:
_7 = _1 ? a_5(D) : b_6(D)

via (maybe a new pattern):
(simplify
 (bit_ior:c
  (cond @0 @1 integer_zerop)
  (cond @0 integer_zerop @2))
 (cond @0 @1 @2))

Which will then be recognized as a max.
Let me see if I can implement the above.
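
The two rewrites proposed in this comment can be checked in plain C. This is
only a sketch of the arithmetic identities behind them, not of the match.pd
machinery; all function names are invented for the example.

```c
/* Step 1, left side: ((a > b) + -1) & d. */
static int and_with_cmp_minus_one(int a, int b, int d)
{
    int c = a > b;              /* zero-or-one valued */
    return (c + -1) & d;
}

/* Step 1, right side: (a > b) ? 0 : d. */
static int cond_zero_else(int a, int b, int d)
{
    return a > b ? 0 : d;
}

/* Step 2, left side: (c ? x : 0) | (c ? 0 : y). */
static int ior_of_conds(int c, int x, int y)
{
    return (c ? x : 0) | (c ? 0 : y);
}

/* Step 2, right side: c ? x : y. */
static int single_cond(int c, int x, int y)
{
    return c ? x : y;
}
```

In step 1 the mask `c + -1` is 0 when the comparison holds and all-ones when
it does not, which is what makes the `&` act as a select.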


* [Bug tree-optimization/95906] Failure to recognize max pattern with mask
  2020-06-26  4:17 [Bug tree-optimization/95906] New: Failure to recognize max pattern with mask gabravier at gmail dot com
                   ` (5 preceding siblings ...)
  2023-06-09 15:25 ` pinskia at gcc dot gnu.org
@ 2023-06-09 20:45 ` pinskia at gcc dot gnu.org
  2023-06-09 20:48 ` pinskia at gcc dot gnu.org
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-06-09 20:45 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95906

--- Comment #5 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Created attachment 55295
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55295&action=edit
Patch to handle the scalar version


* [Bug tree-optimization/95906] Failure to recognize max pattern with mask
  2020-06-26  4:17 [Bug tree-optimization/95906] New: Failure to recognize max pattern with mask gabravier at gmail dot com
                   ` (6 preceding siblings ...)
  2023-06-09 20:45 ` pinskia at gcc dot gnu.org
@ 2023-06-09 20:48 ` pinskia at gcc dot gnu.org
  2023-08-23  2:43 ` pinskia at gcc dot gnu.org
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-06-09 20:48 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95906

--- Comment #6 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Andrew Pinski from comment #5)
> Created attachment 55295 [details]
> Patch to handle the scalar version

I should note that we already handle:
```
int f3(int a, int b)
{
    int cmp = -(a > b);
    int cmp1 = -(a <= b);
    return (cmp & a) | (cmp1 & b);
}
```
via the `(a < b ? c : 0) | (a >= b ? d : 0) into a < b ? c : d` pattern that
is already in match.pd. Adding the `(a?b:0)|(a?0:c) -> (a?b:c)` pattern is
still needed for f2 in comment #0, though.


* [Bug tree-optimization/95906] Failure to recognize max pattern with mask
  2020-06-26  4:17 [Bug tree-optimization/95906] New: Failure to recognize max pattern with mask gabravier at gmail dot com
                   ` (7 preceding siblings ...)
  2023-06-09 20:48 ` pinskia at gcc dot gnu.org
@ 2023-08-23  2:43 ` pinskia at gcc dot gnu.org
  2023-08-23  2:55 ` pinskia at gcc dot gnu.org
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-08-23  2:43 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95906

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Depends on|                            |110949

--- Comment #7 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
  _1 = a_5(D) > b_6(D);
  _2 = (int) _1;
  _11 = a_5(D) > b_6(D);
  _3 = _11 ? a_5(D) : 0;
  _9 = _2 + -1;

If we convert _9 to be:
_t = a_5(D) <= b_6(D)
_9 = - _t

Then we would handle it as mentioned in comment #6.

That conversion is PR 110949.


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110949
[Bug 110949] ((cast)cmp) - 1 should be transformed into -(cast)cmp` where cmp`
is the inverse of cmp
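
The identity behind that conversion can be sketched for integer operands as
follows (function names are hypothetical). Subtracting 1 from a zero-or-one
comparison result gives the same value as negating the inverted comparison.

```c
/* (int)(a > b) - 1: the shape that shows up as _9 = _2 + -1. */
static int cast_cmp_minus_one(int a, int b)
{
    return (int)(a > b) - 1;
}

/* -(int)(a <= b): negation of the inverted comparison. */
static int neg_inverse_cmp(int a, int b)
{
    return -(int)(a <= b);
}
```

Both functions return 0 when `a > b` and -1 otherwise.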


* [Bug tree-optimization/95906] Failure to recognize max pattern with mask
  2020-06-26  4:17 [Bug tree-optimization/95906] New: Failure to recognize max pattern with mask gabravier at gmail dot com
                   ` (8 preceding siblings ...)
  2023-08-23  2:43 ` pinskia at gcc dot gnu.org
@ 2023-08-23  2:55 ` pinskia at gcc dot gnu.org
  2023-08-23  5:18 ` pinskia at gcc dot gnu.org
  2023-08-23  5:21 ` pinskia at gcc dot gnu.org
  11 siblings, 0 replies; 13+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-08-23  2:55 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95906
Bug 95906 depends on bug 54525, which changed state.

Bug 54525 Summary: Recognize (vec_)cond_expr in mask operation
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54525

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED


* [Bug tree-optimization/95906] Failure to recognize max pattern with mask
  2020-06-26  4:17 [Bug tree-optimization/95906] New: Failure to recognize max pattern with mask gabravier at gmail dot com
                   ` (9 preceding siblings ...)
  2023-08-23  2:55 ` pinskia at gcc dot gnu.org
@ 2023-08-23  5:18 ` pinskia at gcc dot gnu.org
  2023-08-23  5:21 ` pinskia at gcc dot gnu.org
  11 siblings, 0 replies; 13+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-08-23  5:18 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95906

--- Comment #8 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
  cmp_8 = -_2;
  _3 = cmp_8 & a_6(D);
We convert that into:
_2 * a_6(D);

And then have:
  _11 = _2 + -1;
  _5 = b_7(D) & _11;

This could be converted into:
(_11 ^ 1) * b_7(D)

```
(simplify
 (bit_and:c (convert? (plus zero_one_valued_p@0 integer_all_onesp)) @1)
 (if (INTEGRAL_TYPE_P (type)
      && INTEGRAL_TYPE_P (TREE_TYPE (@0))
      && TREE_CODE (TREE_TYPE (@0)) != BOOLEAN_TYPE
      /* Sign extending of the neg or a truncation of the neg
         is needed. */
      && (!TYPE_UNSIGNED (TREE_TYPE (@0))
          || TYPE_PRECISION (type) <= TYPE_PRECISION (TREE_TYPE (@0))))
 (mult (convert (bit_xor @0 { build_one_cst (TREE_TYPE (@0)); })) @1))
```
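
A small C sketch of the arithmetic this pattern relies on, with `x` standing
for a zero-or-one value (the operand the pattern captures as `@0`). The
function names are made up for the example, and the identities only hold when
`x` is 0 or 1.

```c
/* -x & a, where x is assumed to be 0 or 1. */
static int neg_and(int x, int a)
{
    return -x & a;
}

/* x * a: the multiply form that -x & a is turned into. */
static int mul_form(int x, int a)
{
    return x * a;
}

/* (x + -1) & b, where x is assumed to be 0 or 1. */
static int dec_and(int x, int b)
{
    return (x + -1) & b;
}

/* (x ^ 1) * b: the multiply form proposed for (x + -1) & b. */
static int xor_mul_form(int x, int b)
{
    return (x ^ 1) * b;
}
```

For `x` in {0, 1}, `x ^ 1` flips the flag, so `(x ^ 1) * b` keeps `b` exactly
when `x + -1` is the all-ones mask.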


* [Bug tree-optimization/95906] Failure to recognize max pattern with mask
  2020-06-26  4:17 [Bug tree-optimization/95906] New: Failure to recognize max pattern with mask gabravier at gmail dot com
                   ` (10 preceding siblings ...)
  2023-08-23  5:18 ` pinskia at gcc dot gnu.org
@ 2023-08-23  5:21 ` pinskia at gcc dot gnu.org
  11 siblings, 0 replies; 13+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-08-23  5:21 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95906

--- Comment #9 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Andrew Pinski from comment #8)
>   cmp_8 = -_2;
>   _3 = cmp_8 & a_6(D);
> We convert that into:
> _2 * a_6(D);
> 
> And then have:
>   _11 = _2 + -1;
>   _5 = b_7(D) & _11;
> 
> This could be convert into:
> (_11 ^ 1) * b_7(D)

Sorry, that should be `(_2 ^ 1) * b_7(D)` (the patch was OK).
