public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug tree-optimization/96239] New: Failure to recognize __builtin_bswap16 pattern
@ 2020-07-18 0:39 gabravier at gmail dot com
2020-07-20 6:15 ` [Bug tree-optimization/96239] " rguenth at gcc dot gnu.org
` (5 more replies)
0 siblings, 6 replies; 7+ messages in thread
From: gabravier at gmail dot com @ 2020-07-18 0:39 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96239
Bug ID: 96239
Summary: Failure to recognize __builtin_bswap16 pattern
Product: gcc
Version: 11.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: gabravier at gmail dot com
Target Milestone: ---
union BytesOverlay
{
uint8_t u8[2];
uint16_t u16;
};
uint16_t f(uint16_t from)
{
BytesOverlay overlay;
overlay.u16 = from;
uint8_t byte1 = overlay.u8[0];
uint8_t byte2 = overlay.u8[1];
overlay.u8[0] = byte2;
overlay.u8[1] = byte1;
return overlay.u16;
}
This can be transformed to `return __builtin_bswap16(from);`. This
transformation is done by LLVM, but not by GCC.
^ permalink raw reply [flat|nested] 7+ messages in thread
* [Bug tree-optimization/96239] Failure to recognize __builtin_bswap16 pattern
2020-07-18 0:39 [Bug tree-optimization/96239] New: Failure to recognize __builtin_bswap16 pattern gabravier at gmail dot com
@ 2020-07-20 6:15 ` rguenth at gcc dot gnu.org
2020-12-14 14:41 ` jakub at gcc dot gnu.org
` (4 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: rguenth at gcc dot gnu.org @ 2020-07-20 6:15 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96239
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Last reconfirmed| |2020-07-20
Blocks| |53947
Status|UNCONFIRMED |NEW
Ever confirmed|0 |1
--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
I see at -O2
Coalescing successful!
Merged into 1 stores
16 bit bswap implementation found at: _9
New sequence of 1 stores to replace old one of 2 stores
_9 = from_2(D) r>> 8;
MEM[(union BytesOverlay *)&overlay] = _9;
and
movl %edi, %eax
rolw $8, %ax
ret
with -O3 the vectorizer gets in the way and elides overlay:
_7 = (unsigned char) from_2(D);
_8 = BIT_FIELD_REF <from_2(D), 8, 8>;
_9 = {_8, _7};
_3 = VIEW_CONVERT_EXPR<short unsigned int>(_9);
overlay ={v} {CLOBBER};
return _3;
now we could fold the V_C_E of the vector ctor to a bswap on from_2 though
that would be quite a big special pattern. Maybe vector-ctor folding can
consider this case to convert it to a V_C_E to a vector type from the
bswap result. OTOH "fixing" the vectorizer to emit a bswap would be
even nicer.
Referenced Bugs:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
[Bug 53947] [meta-bug] vectorizer missed-optimizations
^ permalink raw reply [flat|nested] 7+ messages in thread
* [Bug tree-optimization/96239] Failure to recognize __builtin_bswap16 pattern
2020-07-18 0:39 [Bug tree-optimization/96239] New: Failure to recognize __builtin_bswap16 pattern gabravier at gmail dot com
2020-07-20 6:15 ` [Bug tree-optimization/96239] " rguenth at gcc dot gnu.org
@ 2020-12-14 14:41 ` jakub at gcc dot gnu.org
2020-12-14 19:54 ` jakub at gcc dot gnu.org
` (3 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: jakub at gcc dot gnu.org @ 2020-12-14 14:41 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96239
Jakub Jelinek <jakub at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |jakub at gcc dot gnu.org
--- Comment #2 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Doing it in the vectorizer wouldn't optimize it when users write it by hand,
like:
typedef char V __attribute__((vector_size (2)));
typedef char W __attribute__((vector_size (8)));
V
foo (unsigned short x)
{
return (V) { x >> 8, x };
}
W
bar (unsigned long long x)
{
return (W) { x >> 56, x >> 48, x >> 40, x >> 32, x >> 24, x >> 16, x >> 8, x
};
}
and forwprop doesn't have the infrastructure needed for this.
Wouldn't the bswap pass be the right spot to handle this?
^ permalink raw reply [flat|nested] 7+ messages in thread
* [Bug tree-optimization/96239] Failure to recognize __builtin_bswap16 pattern
2020-07-18 0:39 [Bug tree-optimization/96239] New: Failure to recognize __builtin_bswap16 pattern gabravier at gmail dot com
2020-07-20 6:15 ` [Bug tree-optimization/96239] " rguenth at gcc dot gnu.org
2020-12-14 14:41 ` jakub at gcc dot gnu.org
@ 2020-12-14 19:54 ` jakub at gcc dot gnu.org
2020-12-16 12:11 ` cvs-commit at gcc dot gnu.org
` (2 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: jakub at gcc dot gnu.org @ 2020-12-14 19:54 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96239
--- Comment #3 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Created attachment 49766
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49766&action=edit
gcc11-pr96239.patch
So, I wrote this untested patch which fixes my testcases, but doesn't fix due
to pass ordering issue the original one, as the bswap pass is done before
vectorization only (both loop and slp).
I guess some of the options are schedule another bswap pass instance later, or
include just the vector CONSTRUCTOR handling in e.g. the store-merging pass
(yes, it would be related just because the bswap code lives in the
store-merging source file and performs also bswap optimizations upon stores).
^ permalink raw reply [flat|nested] 7+ messages in thread
* [Bug tree-optimization/96239] Failure to recognize __builtin_bswap16 pattern
2020-07-18 0:39 [Bug tree-optimization/96239] New: Failure to recognize __builtin_bswap16 pattern gabravier at gmail dot com
` (2 preceding siblings ...)
2020-12-14 19:54 ` jakub at gcc dot gnu.org
@ 2020-12-16 12:11 ` cvs-commit at gcc dot gnu.org
2021-01-05 15:17 ` cvs-commit at gcc dot gnu.org
2021-01-05 15:24 ` jakub at gcc dot gnu.org
5 siblings, 0 replies; 7+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2020-12-16 12:11 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96239
--- Comment #4 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:
https://gcc.gnu.org/g:cd676dfa57e643a4f7d8445e6ebad0f21cf3fd84
commit r11-6115-gcd676dfa57e643a4f7d8445e6ebad0f21cf3fd84
Author: Jakub Jelinek <jakub@redhat.com>
Date: Wed Dec 16 13:08:07 2020 +0100
bswap: Handle vector CONSTRUCTORs [PR96239]
The following patch teaches the bswap pass to handle for small (2/4/8 byte
long) vectors a CONSTRUCTOR by determining if the bytes of the constructor
come from non-vector sources and are either nop or bswap and changing the
CONSTRUCTOR in that case to VIEW_CONVERT_EXPR from scalar integer to
the vector type.
Unfortunately, as I found after the patch was written, due to pass ordering
this doesn't really fix the original testcase, just the one I wrote,
because both loop and slp vectorization is done only after the bswap pass.
A possible way out of that would be to perform just this particular bswap
optimization (i.e. for CONSTRUCTOR assignments with integral vector types
call find_bswap_or_nop and bswap_replace if successful) also during the
store merging pass, it isn't really a store, but the store merging pass
already performs bswapping when handling store, so it wouldn't be that big
hack. What do you think?
2020-12-16 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/96239
* gimple-ssa-store-merging.c (find_bswap_or_nop): Handle a vector
CONSTRUCTOR.
(bswap_replace): Likewise.
* gcc.dg/pr96239.c: New test.
^ permalink raw reply [flat|nested] 7+ messages in thread
* [Bug tree-optimization/96239] Failure to recognize __builtin_bswap16 pattern
2020-07-18 0:39 [Bug tree-optimization/96239] New: Failure to recognize __builtin_bswap16 pattern gabravier at gmail dot com
` (3 preceding siblings ...)
2020-12-16 12:11 ` cvs-commit at gcc dot gnu.org
@ 2021-01-05 15:17 ` cvs-commit at gcc dot gnu.org
2021-01-05 15:24 ` jakub at gcc dot gnu.org
5 siblings, 0 replies; 7+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2021-01-05 15:17 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96239
--- Comment #5 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:
https://gcc.gnu.org/g:a7553ad60bebc419d510564b8b9f9e5e03725ff5
commit r11-6470-ga7553ad60bebc419d510564b8b9f9e5e03725ff5
Author: Jakub Jelinek <jakub@redhat.com>
Date: Tue Jan 5 16:16:06 2021 +0100
store-merging: Handle vector CONSTRUCTORs using bswap [PR96239]
I've tried to add such helper, but handling over just analysis and letting
each pass handle it differently seems complicated given the limitations of
the bswap infrastructure.
So, this patch just hooks the optimization also into store-merging so that
the original testcase from the PR can be fixed.
2021-01-05 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/96239
* gimple-ssa-store-merging.c (maybe_optimize_vector_constructor):
New
function.
(get_status_for_store_merging): Don't return BB_INVALID for blocks
with potential bswap optimizable CONSTRUCTORs.
(pass_store_merging::execute): Optimize vector CONSTRUCTORs with
bswap
if possible.
* gcc.dg/tree-ssa/pr96239.c: New test.
^ permalink raw reply [flat|nested] 7+ messages in thread
* [Bug tree-optimization/96239] Failure to recognize __builtin_bswap16 pattern
2020-07-18 0:39 [Bug tree-optimization/96239] New: Failure to recognize __builtin_bswap16 pattern gabravier at gmail dot com
` (4 preceding siblings ...)
2021-01-05 15:17 ` cvs-commit at gcc dot gnu.org
@ 2021-01-05 15:24 ` jakub at gcc dot gnu.org
5 siblings, 0 replies; 7+ messages in thread
From: jakub at gcc dot gnu.org @ 2021-01-05 15:24 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96239
Jakub Jelinek <jakub at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Resolution|--- |FIXED
Status|NEW |RESOLVED
--- Comment #6 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Should be fixed now.
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2021-01-05 15:24 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-07-18 0:39 [Bug tree-optimization/96239] New: Failure to recognize __builtin_bswap16 pattern gabravier at gmail dot com
2020-07-20 6:15 ` [Bug tree-optimization/96239] " rguenth at gcc dot gnu.org
2020-12-14 14:41 ` jakub at gcc dot gnu.org
2020-12-14 19:54 ` jakub at gcc dot gnu.org
2020-12-16 12:11 ` cvs-commit at gcc dot gnu.org
2021-01-05 15:17 ` cvs-commit at gcc dot gnu.org
2021-01-05 15:24 ` jakub at gcc dot gnu.org
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).