public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/104360] New: Failure to optimize abs pattern on vector types
@ 2022-02-03  2:29 gabravier at gmail dot com
  2022-02-03  2:45 ` [Bug tree-optimization/104360] Failure to optimize abs pattern (x^(x<0?-1:0)) - (x<0?-1:0) pinskia at gcc dot gnu.org
  2022-02-03  2:54 ` pinskia at gcc dot gnu.org
  0 siblings, 2 replies; 3+ messages in thread
From: gabravier at gmail dot com @ 2022-02-03  2:29 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104360

            Bug ID: 104360
           Summary: Failure to optimize abs pattern on vector types
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: gabravier at gmail dot com
  Target Milestone: ---

#include <stdint.h>

typedef int16_t v8i16 __attribute__((vector_size(16)));

v8i16 abs_i16(v8i16 x)
{
    auto isN = x < v8i16{};

    x ^= isN;
    return x - isN;
}

This can be optimized to use an abs instruction where one is available (such
as `pabsw` on x86-64, or `abs` on aarch64). I think v8i16 could be replaced
with any integer vector type and it would still work.
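To make the claim concrete, here is a minimal sketch that checks the pattern lane by lane against std::abs. It assumes a compiler that supports the GNU vector extensions (GCC or Clang). `check_lanes` is a hypothetical helper, not part of the report, and `isN` is declared as v8i16 rather than `auto` only for readability.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>

typedef int16_t v8i16 __attribute__((vector_size(16)));

// Pattern from the report: isN is -1 (all ones) in negative lanes and 0 elsewhere.
v8i16 abs_i16(v8i16 x)
{
    v8i16 isN = x < v8i16{};
    x ^= isN;
    return x - isN;
}

// Hypothetical helper, not from the report: asserts that every lane equals std::abs.
// Callers should leave out INT16_MIN, whose absolute value does not fit in int16_t.
void check_lanes(v8i16 x)
{
    v8i16 r = abs_i16(x);
    for (int i = 0; i < 8; ++i)
        assert(r[i] == std::abs(static_cast<int>(x[i])));
}
```

A call such as `check_lanes(v8i16{0, 1, -1, 32767, -32767, 5, -5, 100})` exercises both signs and the largest magnitudes that still fit.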

PS: This doesn't even necessarily require an abs instruction. On standard
x86-64 with -O3, GCC only manages this:

abs_i16(short __vector(8)):
  pxor xmm1, xmm1
  pcmpgtw xmm1, xmm0
  pxor xmm0, xmm1
  psubw xmm0, xmm1
  ret

whereas LLVM outputs this:

abs_i16(short __vector(8)):
  pxor xmm1, xmm1
  psubw xmm1, xmm0
  pmaxsw xmm0, xmm1
  ret

which I'm pretty sure is better.
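For reference, the two sequences can be modelled one lane at a time: GCC's output is the xor/subtract pattern as written, while LLVM's computes max(x, 0 - x). The sketch below is only an illustration under stated assumptions: the function names are hypothetical, the arithmetic is done in uint16_t to mimic the wrapping behaviour of `psubw`, and the conversion back to int16_t is assumed to wrap (guaranteed from C++20, and what GCC and Clang do in earlier modes).

```cpp
#include <cassert>
#include <cstdint>

// One lane of the sequence GCC emits: mask, xor, subtract (wrapping).
int16_t lane_xor_sub(int16_t x)
{
    uint16_t m = x < 0 ? 0xFFFFu : 0u;                       // pcmpgtw: all ones if x < 0
    uint16_t ux = static_cast<uint16_t>(x);
    uint16_t r = static_cast<uint16_t>((ux ^ m) - m);         // pxor, then psubw
    return static_cast<int16_t>(r);
}

// One lane of the sequence LLVM emits: max(x, 0 - x) (wrapping negate).
int16_t lane_neg_max(int16_t x)
{
    uint16_t n = static_cast<uint16_t>(0u - static_cast<uint16_t>(x)); // psubw from zero
    int16_t neg = static_cast<int16_t>(n);
    return x > neg ? x : neg;                                  // pmaxsw
}

// Exhaustively compares the two models over every 16-bit input.
void check_all_lanes()
{
    for (int v = INT16_MIN; v <= INT16_MAX; ++v) {
        int16_t x = static_cast<int16_t>(v);
        assert(lane_xor_sub(x) == lane_neg_max(x));
    }
}
```

Both models return INT16_MIN for an input of INT16_MIN, so they agree on that edge case as well.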


* [Bug tree-optimization/104360] Failure to optimize abs pattern (x^(x<0?-1:0)) - (x<0?-1:0)
From: pinskia at gcc dot gnu.org @ 2022-02-03  2:45 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104360

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
           Severity|normal                      |enhancement
            Summary|Failure to optimize abs     |Failure to optimize abs
                   |pattern on vector types     |pattern (x^(x<0?-1:0)) -
                   |                            |(x<0?-1:0)
           Keywords|                            |missed-optimization
   Last reconfirmed|                            |2022-02-03
             Status|UNCONFIRMED                 |NEW

--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Even the scalar version is not optimized:
typedef short i16;

i16 abs_i16(i16 x)
{
    auto isN = -(x < 0);

    x ^= isN;
    return x - isN;
}

It should not be too hard to optimize both.

Funnily enough, clang/LLVM does not catch the scalar version either, unless
you write the mask as:
(x < i16{}) ? -1 : 0
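The two spellings of the mask compute the same value, which a small sketch can confirm. The names `abs_i16_neg`, `abs_i16_sel` and `check_spellings_agree` are hypothetical; only the function bodies follow the test case above. The conversion of an out-of-range int back to short (which only happens for SHRT_MIN) is assumed to wrap, as it does on GCC and Clang.

```cpp
#include <cassert>
#include <climits>

typedef short i16;

// Spelling from the test case above: negate the result of the comparison.
i16 abs_i16_neg(i16 x)
{
    auto isN = -(x < 0);
    x ^= isN;
    return x - isN;
}

// Same function with the mask spelled as a select.
i16 abs_i16_sel(i16 x)
{
    auto isN = (x < i16{}) ? -1 : 0;
    x ^= isN;
    return x - isN;
}

// Compares the two spellings over every value of i16.
void check_spellings_agree()
{
    for (int v = SHRT_MIN; v <= SHRT_MAX; ++v) {
        i16 x = static_cast<i16>(v);
        assert(abs_i16_neg(x) == abs_i16_sel(x));
    }
}
```

Since both functions are semantically identical, any difference in the generated code comes down to which form the optimizer's pattern matching recognizes.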


* [Bug tree-optimization/104360] Failure to optimize abs pattern (x^(x<0?-1:0)) - (x<0?-1:0)
From: pinskia at gcc dot gnu.org @ 2022-02-03  2:54 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104360

--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Note that the vector version of this is easier to detect, though:
  isN_3 = x_2(D) < { 0, 0, 0, 0, 0, 0, 0, 0 };
  x_4 = x_2(D) ^ isN_3;
  _5 = x_4 - isN_3;


Pattern here:
(minus (bit_xor:c @0 (lt@1 @0 vector_zero_p)) @1)

than the scalar version:
  _10 = x_6(D) < 0;
  _11 = (int) _10;
  _12 = -_11;
  _1 = (short int) _12;
  x_7 = _1 ^ x_6(D);
  x.1_2 = (unsigned short) x_7;
  _3 = (unsigned short) _12;
  _4 = x.1_2 - _3;
  _8 = (i16) _4;

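The scalar version is harder to match because of the conversions to unsigned
short, which are there to avoid signed overflow in the narrowed subtraction.

With -fwrapv we get: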
  _7 = x_3(D) < 0;
  _8 = (int) _7;
  _9 = -_8;
  _1 = (short int) _9;
  x_4 = _1 ^ x_3(D);
  _5 = x_4 - _1;

Where we could reduce _1 to just:
t = (short int) _7;
_1 = -t;


And then it is just pattern matching.

For int we get:
  _6 = x_2(D) < 0;
  _7 = (int) _6;
  _8 = -_7;
  x_3 = x_2(D) ^ _8;
  _4 = x_3 + _7;
Which should be easy to pattern match.

(plus:c (bit_xor:c @0 (neg (convert@1 (lt @0 zero_p)))) @1)
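As a sanity check on the int form, here is a minimal sketch that mirrors the int GIMPLE above one statement at a time and compares it with a plain absolute value. The function names are hypothetical. INT_MIN is deliberately left out of the samples, because the final addition would overflow for it.

```cpp
#include <cassert>
#include <climits>

// Mirrors the int GIMPLE above, one statement per SSA name.
int abs_int_pattern(int x)
{
    int t = x < 0;   // _7 = (int) (x < 0): 0 or 1
    int m = -t;      // _8 = -_7: 0 or -1
    int y = x ^ m;   // x_3 = x ^ _8
    return y + t;    // _4 = x_3 + _7
}

// Compares the pattern with a straightforward absolute value on a few samples.
void check_int_pattern()
{
    const int samples[] = {0, 1, -1, 42, -42, INT_MAX, -INT_MAX};
    for (int x : samples)
        assert(abs_int_pattern(x) == (x < 0 ? -x : x));
}
```

Note that the subtraction of the mask shows up here as an addition of `t`, which is why the pattern above uses plus rather than minus.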
