public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug middle-end/99520] New: Failure to detect bswap pattern
@ 2021-03-10 15:54 ktkachov at gcc dot gnu.org
2021-03-10 16:03 ` [Bug middle-end/99520] " jakub at gcc dot gnu.org
` (8 more replies)
0 siblings, 9 replies; 10+ messages in thread
From: ktkachov at gcc dot gnu.org @ 2021-03-10 15:54 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99520
Bug ID: 99520
Summary: Failure to detect bswap pattern
Product: gcc
Version: unknown
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: normal
Priority: P3
Component: middle-end
Assignee: unassigned at gcc dot gnu.org
Reporter: ktkachov at gcc dot gnu.org
Target Milestone: ---
#include <stdint.h>
uint32_t endian_fix32( uint32_t x ){
return (x<<24) + ((x<<8)&0xff0000) + ((x>>8)&0xff00) + (x>>24);
}
For aarch64 clang manages to optimise it to:
endian_fix32(unsigned int): // @endian_fix32(unsigned int)
rev w0, w0
ret
GCC doesn't:
endian_fix32(unsigned int):
lsl w1, w0, 8
lsr w2, w0, 8
and w2, w2, 65280
lsr w3, w0, 24
and w1, w1, 16711680
add w0, w3, w0, lsl 24
orr w1, w1, w2
add w0, w1, w0
ret
Is there something missing in the bswap pass?
^ permalink raw reply [flat|nested] 10+ messages in thread
* [Bug middle-end/99520] Failure to detect bswap pattern
2021-03-10 15:54 [Bug middle-end/99520] New: Failure to detect bswap pattern ktkachov at gcc dot gnu.org
@ 2021-03-10 16:03 ` jakub at gcc dot gnu.org
2021-03-10 16:13 ` jakub at gcc dot gnu.org
` (7 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: jakub at gcc dot gnu.org @ 2021-03-10 16:03 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99520
Jakub Jelinek <jakub at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |jakub at gcc dot gnu.org
--- Comment #1 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
The bswap pass only detects it with |s instead of +s.
^ permalink raw reply [flat|nested] 10+ messages in thread
* [Bug middle-end/99520] Failure to detect bswap pattern
2021-03-10 15:54 [Bug middle-end/99520] New: Failure to detect bswap pattern ktkachov at gcc dot gnu.org
2021-03-10 16:03 ` [Bug middle-end/99520] " jakub at gcc dot gnu.org
@ 2021-03-10 16:13 ` jakub at gcc dot gnu.org
2021-03-10 16:17 ` jakub at gcc dot gnu.org
` (6 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: jakub at gcc dot gnu.org @ 2021-03-10 16:13 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99520
--- Comment #2 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Bet handling PLUS_EXPR next to BIT_IOR_EXPR in find_bswap_or_nop_1
and pass_optimize_bswap::execute could do the trick.
Though, I guess for PLUS_EXPR one needs stronger tests in
perform_symbolic_merge.
Because for BIT_IOR_EXPR, it is perfectly fine if the same bytes are ored in
multiple times, but for PLUS_EXPR it obviously is not.
So
masked1 = n1->n & mask;
masked2 = n2->n & mask;
if (masked1 && masked2 && masked1 != masked2)
return NULL;
is good enough for BIT_IOR_EXPR, but PLUS_EXPR would need a
if (masked1 && masked2)
return NULL;
test.
^ permalink raw reply [flat|nested] 10+ messages in thread
* [Bug middle-end/99520] Failure to detect bswap pattern
2021-03-10 15:54 [Bug middle-end/99520] New: Failure to detect bswap pattern ktkachov at gcc dot gnu.org
2021-03-10 16:03 ` [Bug middle-end/99520] " jakub at gcc dot gnu.org
2021-03-10 16:13 ` jakub at gcc dot gnu.org
@ 2021-03-10 16:17 ` jakub at gcc dot gnu.org
2021-03-10 16:19 ` ktkachov at gcc dot gnu.org
` (5 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: jakub at gcc dot gnu.org @ 2021-03-10 16:17 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99520
--- Comment #3 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Consider e.g.
unsigned foo (unsigned x)
{
return (x<<24) + ((x<<8)&0xff0000) + ((x>>8)&0xff00) + (x>>24) +
(((x&0xff00)<<16)>>8);
}
as example that should not be optimized into __builtin_bswap32 (but should be
with | instead of +).
^ permalink raw reply [flat|nested] 10+ messages in thread
* [Bug middle-end/99520] Failure to detect bswap pattern
2021-03-10 15:54 [Bug middle-end/99520] New: Failure to detect bswap pattern ktkachov at gcc dot gnu.org
` (2 preceding siblings ...)
2021-03-10 16:17 ` jakub at gcc dot gnu.org
@ 2021-03-10 16:19 ` ktkachov at gcc dot gnu.org
2021-03-10 16:28 ` jakub at gcc dot gnu.org
` (4 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: ktkachov at gcc dot gnu.org @ 2021-03-10 16:19 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99520
--- Comment #4 from ktkachov at gcc dot gnu.org ---
(In reply to Jakub Jelinek from comment #3)
> Consider e.g.
> unsigned foo (unsigned x)
> {
> return (x<<24) + ((x<<8)&0xff0000) + ((x>>8)&0xff00) + (x>>24) +
> (((x&0xff00)<<16)>>8);
> }
> as example that should not be optimized into __builtin_bswap32 (but should
> be with | instead of +).
interesting. That said, clang does do a better job than GCC on that too:
foo(unsigned int): // @foo(unsigned int)
lsl w9, w0, #8
rev w8, w0
and w9, w9, #0xff0000
add w0, w9, w8
ret
^ permalink raw reply [flat|nested] 10+ messages in thread
* [Bug middle-end/99520] Failure to detect bswap pattern
2021-03-10 15:54 [Bug middle-end/99520] New: Failure to detect bswap pattern ktkachov at gcc dot gnu.org
` (3 preceding siblings ...)
2021-03-10 16:19 ` ktkachov at gcc dot gnu.org
@ 2021-03-10 16:28 ` jakub at gcc dot gnu.org
2021-03-10 17:43 ` ktkachov at gcc dot gnu.org
` (3 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: jakub at gcc dot gnu.org @ 2021-03-10 16:28 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99520
--- Comment #5 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Not sure it is worth bothering except making sure we don't miscompile valid
weird code.
BTW, for testing, one can also mix the |s and +s in the expressions randomly to
get the result.
^ permalink raw reply [flat|nested] 10+ messages in thread
* [Bug middle-end/99520] Failure to detect bswap pattern
2021-03-10 15:54 [Bug middle-end/99520] New: Failure to detect bswap pattern ktkachov at gcc dot gnu.org
` (4 preceding siblings ...)
2021-03-10 16:28 ` jakub at gcc dot gnu.org
@ 2021-03-10 17:43 ` ktkachov at gcc dot gnu.org
2021-03-11 8:10 ` [Bug tree-optimization/99520] " rguenth at gcc dot gnu.org
` (2 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: ktkachov at gcc dot gnu.org @ 2021-03-10 17:43 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99520
--- Comment #6 from ktkachov at gcc dot gnu.org ---
FWIW the version in the initial comment is what appears in 525.x264_r in
SPEC2017
^ permalink raw reply [flat|nested] 10+ messages in thread
* [Bug tree-optimization/99520] Failure to detect bswap pattern
2021-03-10 15:54 [Bug middle-end/99520] New: Failure to detect bswap pattern ktkachov at gcc dot gnu.org
` (5 preceding siblings ...)
2021-03-10 17:43 ` ktkachov at gcc dot gnu.org
@ 2021-03-11 8:10 ` rguenth at gcc dot gnu.org
2021-11-21 14:46 ` pinskia at gcc dot gnu.org
2021-11-25 20:28 ` roger at nextmovesoftware dot com
8 siblings, 0 replies; 10+ messages in thread
From: rguenth at gcc dot gnu.org @ 2021-03-11 8:10 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99520
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Version|unknown |11.0
Ever confirmed|0 |1
Last reconfirmed| |2021-03-11
Component|middle-end |tree-optimization
Status|UNCONFIRMED |NEW
--- Comment #7 from Richard Biener <rguenth at gcc dot gnu.org> ---
I think there was a recent duplicate.
^ permalink raw reply [flat|nested] 10+ messages in thread
* [Bug tree-optimization/99520] Failure to detect bswap pattern
2021-03-10 15:54 [Bug middle-end/99520] New: Failure to detect bswap pattern ktkachov at gcc dot gnu.org
` (6 preceding siblings ...)
2021-03-11 8:10 ` [Bug tree-optimization/99520] " rguenth at gcc dot gnu.org
@ 2021-11-21 14:46 ` pinskia at gcc dot gnu.org
2021-11-25 20:28 ` roger at nextmovesoftware dot com
8 siblings, 0 replies; 10+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-11-21 14:46 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99520
--- Comment #8 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #7)
> I think there was a recent duplicate.
Yes I think PR 98953.
^ permalink raw reply [flat|nested] 10+ messages in thread
* [Bug tree-optimization/99520] Failure to detect bswap pattern
2021-03-10 15:54 [Bug middle-end/99520] New: Failure to detect bswap pattern ktkachov at gcc dot gnu.org
` (7 preceding siblings ...)
2021-11-21 14:46 ` pinskia at gcc dot gnu.org
@ 2021-11-25 20:28 ` roger at nextmovesoftware dot com
8 siblings, 0 replies; 10+ messages in thread
From: roger at nextmovesoftware dot com @ 2021-11-25 20:28 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99520
Roger Sayle <roger at nextmovesoftware dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
Target Milestone|--- |12.0
CC| |roger at nextmovesoftware dot com
Status|NEW |RESOLVED
Resolution|--- |FIXED
--- Comment #9 from Roger Sayle <roger at nextmovesoftware dot com> ---
This PR is now fixed on mainline. Thanks to Jakub (my apologies if I'd seen
comment #2 I wouldn't of accidentally broken things; aka PR
tree-optimization/103376, fortunately Jakub was able to quickly correct my
oversight).
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2021-11-25 20:28 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-03-10 15:54 [Bug middle-end/99520] New: Failure to detect bswap pattern ktkachov at gcc dot gnu.org
2021-03-10 16:03 ` [Bug middle-end/99520] " jakub at gcc dot gnu.org
2021-03-10 16:13 ` jakub at gcc dot gnu.org
2021-03-10 16:17 ` jakub at gcc dot gnu.org
2021-03-10 16:19 ` ktkachov at gcc dot gnu.org
2021-03-10 16:28 ` jakub at gcc dot gnu.org
2021-03-10 17:43 ` ktkachov at gcc dot gnu.org
2021-03-11 8:10 ` [Bug tree-optimization/99520] " rguenth at gcc dot gnu.org
2021-11-21 14:46 ` pinskia at gcc dot gnu.org
2021-11-25 20:28 ` roger at nextmovesoftware dot com
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).