public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/96912] New: Failure to optimize pattern
@ 2020-09-03  7:31 gabravier at gmail dot com
  2020-09-03  8:03 ` [Bug tree-optimization/96912] Failure to optimize pblendvb pattern rguenth at gcc dot gnu.org
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: gabravier at gmail dot com @ 2020-09-03  7:31 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96912

            Bug ID: 96912
           Summary: Failure to optimize pattern
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: gabravier at gmail dot com
  Target Milestone: ---

typedef char v16i8 __attribute__((vector_size(16)));
typedef int64_t v2i64 __attribute__((vector_size(16)));

v2i64 blend_epi8(v2i64 x, v2i64 y, v16i8 mask)
{
    v2i64 tmp = (mask < 0);
    return (~tmp & x) | (tmp & y);
}

This can be optimized to `return (v2i64)__builtin_ia32_pblendvb128((v16i8)x,
(v16i8)y, mask);`. This transformation is done by LLVM, but not by GCC.
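
As a reference point for the requested result, here is a minimal scalar sketch of what a byte-wise variable blend such as pblendvb computes. The function name `blendv_bytes_ref` and the array-based interface are hypothetical and used only for illustration; they are not part of GCC or of the testcase above.

```c
#include <stdint.h>

/* Scalar model of a byte-wise variable blend: for each of the 16 bytes,
   take the byte from y when the sign bit of the mask byte is set,
   otherwise take the byte from x.  Hypothetical helper, for illustration. */
void blendv_bytes_ref(const uint8_t x[16], const uint8_t y[16],
                      const uint8_t mask[16], uint8_t out[16])
{
    for (int i = 0; i < 16; i++)
        out[i] = (mask[i] & 0x80) ? y[i] : x[i];
}
```

Only the top bit of each mask byte matters in this model, which matches the `mask < 0` comparison in the testcase.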


* [Bug tree-optimization/96912] Failure to optimize pblendvb pattern
From: rguenth at gcc dot gnu.org @ 2020-09-03  8:03 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96912

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2020-09-03
             Status|UNCONFIRMED                 |NEW
     Ever confirmed|0                           |1

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
On GIMPLE this could be pattern-matched to a VEC_PERM_EXPR <> which is
how to represent blends.  Of course matching this on RTL is possible as well.

The difficulty is of course the mismatch in element sizes where the
mask appears to be byte granular while the data to be blended is double-words.
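
To illustrate the element-size mismatch, the following sketch (assuming the GCC vector extensions, and using `signed char` plus explicit casts so that it does not depend on the signedness of plain `char`) shows a byte mask selecting only part of a 64-bit lane. The helper `lane_is_mixed` is hypothetical.

```c
#include <stdint.h>

typedef signed char v16i8 __attribute__((vector_size(16)));
typedef int64_t v2i64 __attribute__((vector_size(16)));

/* Same computation as the testcase, with the reinterpretation spelled out. */
v2i64 blend_epi8(v2i64 x, v2i64 y, v16i8 mask)
{
    v2i64 tmp = (v2i64)(mask < 0);
    return (~tmp & x) | (tmp & y);
}

/* Returns 1 when the given 64-bit lane of r equals neither x's nor y's
   lane, i.e. the lane was assembled from bytes of both inputs. */
int lane_is_mixed(v2i64 x, v2i64 y, v2i64 r, int lane)
{
    return r[lane] != x[lane] && r[lane] != y[lane];
}
```

With `x` all zeros, `y` all ones and a mask that is negative in a single byte, the affected lane is mixed, so the blend cannot be expressed as a select on whole double-word elements.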


* [Bug tree-optimization/96912] Failure to optimize pblendvb pattern
From: glisse at gcc dot gnu.org @ 2020-09-03  8:03 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96912

--- Comment #2 from Marc Glisse <glisse at gcc dot gnu.org> ---
With consistent types, we recognize a VEC_COND_EXPR. With inconsistent types, I
guess we would need to reinterpret x and y as v16i8, and reinterpret the result
back to v2i64.

(please keep #include <stdint.h> in your testcases so we can just copy-paste
and compile them, or use long long instead of int64_t)
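
A minimal sketch of the reinterpretation described above, written at the source level with the GCC vector extensions: x and y are viewed as byte vectors, the select is done with consistent types, and the result is viewed as v2i64 again. The name `blend_epi8_bytes` is hypothetical, and `signed char` is used instead of `char` as an assumption so that `mask < 0` behaves the same on targets where plain `char` is unsigned.

```c
#include <stdint.h>

typedef signed char v16i8 __attribute__((vector_size(16)));
typedef int64_t v2i64 __attribute__((vector_size(16)));

v2i64 blend_epi8_bytes(v2i64 x, v2i64 y, v16i8 mask)
{
    v16i8 tmp = (v16i8)(mask < 0);      /* all-ones or all-zeros per byte */
    v16i8 xb = (v16i8)x;                /* reinterpret, no value conversion */
    v16i8 yb = (v16i8)y;
    v16i8 r = (~tmp & xb) | (tmp & yb); /* select with consistent types */
    return (v2i64)r;                    /* reinterpret the result back */
}
```

The casts between same-sized vector types only reinterpret the bits, so this computes the same value as the original mixed-type expression.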


* [Bug tree-optimization/96912] Failure to optimize pblendvb pattern
From: jakub at gcc dot gnu.org @ 2020-11-24 18:07 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96912

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #3 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Note, in:
typedef char V __attribute__((vector_size(16)));
typedef long long W __attribute__((vector_size(16)));

W
foo (W x, W y, V m)
{
  W t = (m < 0);
  return (~t & x) | (t & y);
}

V
bar (V x, V y, V m)
{
  V t = (m < 0);
  return (~t & x) | (t & y);
}

we actually optimize bar the way we should, seems it is forwprop1 that turns
  _1 = m_5(D) < { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 };
  t_6 = VEC_COND_EXPR <_1,
      { -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1 },
      { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 }>;
  _2 = ~t_6;
  _3 = x_7(D) & _2;
  _4 = t_6 & y_8(D);
  _9 = _3 | _4;
  return _9;
into:
  _1 = m_5(D) < { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 };
  t_6 = VEC_COND_EXPR <_1,
      { -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1 },
      { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 }>;
  _2 = VEC_COND_EXPR <_1,
      { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 },
      { -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1 }>;
  _3 = VEC_COND_EXPR <_1,
      { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 },
      x_7(D)>;
  _4 = VEC_COND_EXPR <_1,
      y_8(D),
      { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 }>;
  _9 = VEC_COND_EXPR <_1, y_8(D), x_7(D)>;
  return _9;
but the similar:
  _1 = m_6(D) < { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 };
  _2 = VEC_COND_EXPR <_1,
      { -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1 },
      { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 }>;
  t_7 = VIEW_CONVERT_EXPR<W>(_2);
  _3 = ~t_7;
  _4 = x_8(D) & _3;
  _5 = t_7 & y_9(D);
  _10 = _4 | _5;
  return _10;
in foo isn't optimized similarly.  I'll look tomorrow at that, we should handle
it like bar with the VEC_COND_EXPR being done in the vector type corresponding
to the comparison with VCEs (VIEW_CONVERT_EXPRs) to that and back.
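
The equivalence this relies on can be checked at the source level. The sketch below restates foo and bar (with `signed char` and explicit casts, an assumption made here so the result does not depend on the signedness of plain `char`) and adds two hypothetical helpers: `foo_via_bar`, which does the select in the byte vector type with reinterpreting casts to that type and back, and `same_bits`, which compares two W values lane by lane.

```c
typedef signed char V __attribute__((vector_size(16)));
typedef long long W __attribute__((vector_size(16)));

W foo(W x, W y, V m)
{
    W t = (W)(m < 0);
    return (~t & x) | (t & y);
}

V bar(V x, V y, V m)
{
    V t = (V)(m < 0);
    return (~t & x) | (t & y);
}

/* foo expressed through bar: view x and y as V, select, view back as W. */
W foo_via_bar(W x, W y, V m)
{
    return (W)bar((V)x, (V)y, m);
}

/* Returns 1 when both 64-bit lanes of a and b are equal. */
int same_bits(W a, W b)
{
    return a[0] == b[0] && a[1] == b[1];
}
```

For any inputs, `same_bits(foo(x, y, m), foo_via_bar(x, y, m))` is expected to hold, since the bitwise operations do not depend on the element type.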


* [Bug tree-optimization/96912] Failure to optimize pblendvb pattern
From: pinskia at gcc dot gnu.org @ 2021-08-15  0:24 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96912

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|2020-09-03 00:00:00         |2021-8-14
           Severity|normal                      |enhancement

