public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/96912] New: Failure to optimize pattern
From: gabravier at gmail dot com @ 2020-09-03 7:31 UTC
To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96912

Bug ID: 96912
Summary: Failure to optimize pattern
Product: gcc
Version: 11.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: gabravier at gmail dot com
Target Milestone: ---

typedef char v16i8 __attribute__((vector_size(16)));
typedef int64_t v2i64 __attribute__((vector_size(16)));

v2i64 blend_epi8(v2i64 x, v2i64 y, v16i8 mask)
{
    v2i64 tmp = (mask < 0);
    return (~tmp & x) | (tmp & y);
}

This can be optimized to `return (v2i64)__builtin_ia32_pblendvb128((v16i8)x,
(v16i8)y, mask);`. This transformation is done by LLVM, but not by GCC.
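Editorial note: the following is a minimal sketch of what the reported pattern computes, not code from the report. It assumes GCC-style vector extensions, uses `signed char` so that `mask < 0` is meaningful on targets where plain `char` is unsigned, and adds explicit casts. The scalar helper `blend_epi8_ref` is a hypothetical name introduced here for illustration; it spells out the per-byte selection that a byte blend is expected to perform.

```c
#include <stdint.h>

typedef signed char v16i8 __attribute__((vector_size(16)));
typedef int64_t v2i64 __attribute__((vector_size(16)));

/* The pattern from the report: each mask byte with its sign bit set
   becomes an all-ones byte, which then selects the matching byte of y. */
v2i64 blend_epi8(v2i64 x, v2i64 y, v16i8 mask)
{
    v2i64 tmp = (v2i64)(mask < 0);
    return (~tmp & x) | (tmp & y);
}

/* Hypothetical scalar reference: pick y's byte where the mask byte is
   negative, x's byte otherwise. */
v2i64 blend_epi8_ref(v2i64 x, v2i64 y, v16i8 mask)
{
    v16i8 xb = (v16i8)x;
    v16i8 yb = (v16i8)y;
    v16i8 out = xb;
    for (int i = 0; i < 16; i++)
        out[i] = mask[i] < 0 ? yb[i] : xb[i];
    return (v2i64)out;
}
```

Both functions should return the same vector for any inputs; the report asks for the first form to be compiled to a single byte-blend instruction.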
* [Bug tree-optimization/96912] Failure to optimize pblendvb pattern
From: rguenth at gcc dot gnu.org @ 2020-09-03 8:03 UTC
To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96912

Richard Biener <rguenth at gcc dot gnu.org> changed:

What |Removed |Added
----------------------------------------------------------------------------
Last reconfirmed| |2020-09-03
Status|UNCONFIRMED |NEW
Ever confirmed|0 |1

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
On GIMPLE this could be pattern-matched to a VEC_PERM_EXPR <> which is how to
represent blends. Of course matching this on RTL is possible as well.

The difficulty is of course the mismatch in element sizes where the mask
appears to be byte granular while the data to be blended is double-words.
* [Bug tree-optimization/96912] Failure to optimize pblendvb pattern
From: glisse at gcc dot gnu.org @ 2020-09-03 8:03 UTC
To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96912

--- Comment #2 from Marc Glisse <glisse at gcc dot gnu.org> ---
With consistent types, we recognize a VEC_COND_EXPR. With inconsistent types, I
guess we would need to reinterpret x and y as v16i8, and reinterpret the result
back to v2i64.

(please keep #include <stdint.h> in your testcases so we can just copy-paste
and compile them, or use long long instead of int64_t)
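Editorial note: the reinterpretation described in comment #2 can be sketched at the source level as below. This is an illustration under stated assumptions, not the transformation GCC performs internally: it assumes GCC-style vector extensions, uses `signed char` for the byte vector, and `blend_epi8_bytes` is a hypothetical name. The idea is to do the whole blend in the byte vector type, so the mask and the data have consistent element sizes, and cast only the final result back.

```c
#include <stdint.h>

typedef signed char v16i8 __attribute__((vector_size(16)));
typedef int64_t v2i64 __attribute__((vector_size(16)));

/* Mixed-type form from the report: the comparison result is
   reinterpreted as v2i64 before the bitwise blend. */
v2i64 blend_epi8(v2i64 x, v2i64 y, v16i8 mask)
{
    v2i64 tmp = (v2i64)(mask < 0);
    return (~tmp & x) | (tmp & y);
}

/* Consistent-type form (hypothetical helper): reinterpret x and y as
   v16i8, blend per byte, then reinterpret the result as v2i64. */
v2i64 blend_epi8_bytes(v2i64 x, v2i64 y, v16i8 mask)
{
    v16i8 t = (v16i8)(mask < 0);
    v16i8 xb = (v16i8)x;
    v16i8 yb = (v16i8)y;
    return (v2i64)((~t & xb) | (t & yb));
}
```

The casts between same-sized vector types only reinterpret bits, so the two functions compute the same value; the second simply keeps every operation in the type of the comparison.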
* [Bug tree-optimization/96912] Failure to optimize pblendvb pattern
From: jakub at gcc dot gnu.org @ 2020-11-24 18:07 UTC
To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96912

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

What |Removed |Added
----------------------------------------------------------------------------
CC| |jakub at gcc dot gnu.org

--- Comment #3 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Note, in:

typedef char V __attribute__((vector_size(16)));
typedef long long W __attribute__((vector_size(16)));

W foo (W x, W y, V m)
{
  W t = (m < 0);
  return (~t & x) | (t & y);
}

V bar (V x, V y, V m)
{
  V t = (m < 0);
  return (~t & x) | (t & y);
}

we actually optimize bar the way we should, seems it is forwprop1 that turns

  _1 = m_5(D) < { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 };
  t_6 = VEC_COND_EXPR <_1, { -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1 }, { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 }>;
  _2 = ~t_6;
  _3 = x_7(D) & _2;
  _4 = t_6 & y_8(D);
  _9 = _3 | _4;
  return _9;

into:

  _1 = m_5(D) < { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 };
  t_6 = VEC_COND_EXPR <_1, { -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1 }, { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 }>;
  _2 = VEC_COND_EXPR <_1, { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 }, { -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1 }>;
  _3 = VEC_COND_EXPR <_1, { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 }, x_7(D)>;
  _4 = VEC_COND_EXPR <_1, y_8(D), { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 }>;
  _9 = VEC_COND_EXPR <_1, y_8(D), x_7(D)>;
  return _9;

but the similar:

  _1 = m_6(D) < { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 };
  _2 = VEC_COND_EXPR <_1, { -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1 }, { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 }>;
  t_7 = VIEW_CONVERT_EXPR<W>(_2);
  _3 = ~t_7;
  _4 = x_8(D) & _3;
  _5 = t_7 & y_9(D);
  _10 = _4 | _5;
  return _10;

in foo isn't optimized similarly.

I'll look tomorrow at that, we should handle it like bar with the VEC_COND_EXPR
being done in the vector type corresponding to the comparison with VCEs to that
and back.
* [Bug tree-optimization/96912] Failure to optimize pblendvb pattern
From: pinskia at gcc dot gnu.org @ 2021-08-15 0:24 UTC
To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96912

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

What |Removed |Added
----------------------------------------------------------------------------
Last reconfirmed|2020-09-03 00:00:00 |2021-8-14
Severity|normal |enhancement