public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/102391] New: Failure to optimize 2 8-bit loads into a single 16-bit load
@ 2021-09-17 22:32 gabravier at gmail dot com
2021-09-18 0:03 ` [Bug tree-optimization/102391] Failure to optimize adjacent 8-bit loads into a single bigger load gabravier at gmail dot com
` (3 more replies)
0 siblings, 4 replies; 5+ messages in thread
From: gabravier at gmail dot com @ 2021-09-17 22:32 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102391
Bug ID: 102391
Summary: Failure to optimize 2 8-bit loads into a single 16-bit load
Product: gcc
Version: 12.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: gabravier at gmail dot com
Target Milestone: ---
#include <stdint.h>

uint16_t HeaderReadU16LE(int offset, uint8_t *RomHeader)
{
    return RomHeader[offset] |
           (RomHeader[offset + 1] << 8);
}
This can be optimized into a single 16-bit load. At -O3, LLVM performs this
optimization, but GCC does not.
This winds up affecting the resulting assembly quite a bit:
AMD64 GCC:

HeaderReadU16LE:
        movsx   rdi, edi
        movzx   edx, BYTE PTR [rsi+1+rdi]
        movzx   eax, BYTE PTR [rsi+rdi]
        sal     edx, 8
        or      eax, edx
        ret

AMD64 LLVM:

HeaderReadU16LE:
        movsxd  rax, edi
        movzx   eax, word ptr [rsi + rax]
        ret
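
For reference, the single 16-bit load that LLVM emits corresponds roughly to reading both bytes at once with memcpy. The sketch below is illustrative only: the names LoadU16Bytewise, LoadU16HostOrder and HostIsLittleEndian are hypothetical and not part of the report, and the two loaders agree only on a little-endian host such as AMD64.

```c
#include <stdint.h>
#include <string.h>

/* Byte-wise little-endian read, using the same expression as the report. */
static uint16_t LoadU16Bytewise(int offset, const uint8_t *RomHeader)
{
    return (uint16_t)(RomHeader[offset] |
                      (RomHeader[offset + 1] << 8));
}

/* One 16-bit read in host byte order. memcpy avoids alignment and
   aliasing problems; compilers commonly lower it to a plain load. */
static uint16_t LoadU16HostOrder(int offset, const uint8_t *RomHeader)
{
    uint16_t value;
    memcpy(&value, RomHeader + offset, sizeof value);
    return value;
}

/* Returns nonzero when the least significant byte is stored first. */
static int HostIsLittleEndian(void)
{
    const uint16_t probe = 1;
    uint8_t first;
    memcpy(&first, &probe, 1);
    return first == 1;
}
```

On a big-endian host the merged load would additionally need a byte swap to stay equivalent to the byte-wise version.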
* [Bug tree-optimization/102391] Failure to optimize adjacent 8-bit loads into a single bigger load
From: gabravier at gmail dot com @ 2021-09-18 0:03 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102391
Gabriel Ravier <gabravier at gmail dot com> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|Failure to optimize 2 8-bit |Failure to optimize
                   |loads into a single 16-bit  |adjacent 8-bit loads into a
                   |load                        |single bigger load
--- Comment #1 from Gabriel Ravier <gabravier at gmail dot com> ---
Note: the same applies to bigger sizes:
uint32_t HeaderReadU32LE(int offset, uint8_t *RomHeader)
{
    return RomHeader[offset] |
           (RomHeader[offset + 1] << 8) |
           (RomHeader[offset + 2] << 16) |
           (RomHeader[offset + 3] << 24);
}
On AMD64, GCC outputs this:
HeaderReadU32LE:
        movsx   rdi, edi
        movzx   eax, BYTE PTR [rsi+1+rdi]
        movzx   edx, BYTE PTR [rsi+2+rdi]
        sal     eax, 8
        sal     edx, 16
        or      eax, edx
        movzx   edx, BYTE PTR [rsi+rdi]
        or      eax, edx
        movzx   edx, BYTE PTR [rsi+3+rdi]
        sal     edx, 24
        or      eax, edx
        ret
LLVM manages this:
HeaderReadU32LE:
        movsxd  rax, edi
        mov     eax, dword ptr [rsi + rax]
        ret
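
One detail in the 32-bit source above: each uint8_t operand is promoted to int before shifting, so RomHeader[offset + 3] << 24 shifts into the sign bit whenever that byte is 0x80 or larger, which ISO C leaves formally undefined when int is 32 bits wide. A hedged variant that does the arithmetic in uint32_t is sketched below. HeaderReadU32LE_Unsigned is a hypothetical name, and the report does not say whether GCC merges the loads any differently for this form.

```c
#include <stdint.h>

/* Same little-endian assembly of four bytes, but every operand is
   widened to uint32_t first so no shift touches a signed sign bit. */
uint32_t HeaderReadU32LE_Unsigned(int offset, const uint8_t *RomHeader)
{
    return  (uint32_t)RomHeader[offset]
         | ((uint32_t)RomHeader[offset + 1] << 8)
         | ((uint32_t)RomHeader[offset + 2] << 16)
         | ((uint32_t)RomHeader[offset + 3] << 24);
}
```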
* [Bug tree-optimization/102391] Failure to optimize adjacent 8-bit loads into a single bigger load
From: pinskia at gcc dot gnu.org @ 2021-09-18 0:14 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102391
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|normal                      |enhancement
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2021-09-18
             Status|UNCONFIRMED                 |NEW
--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
GCC can figure out the case where offset = 0.
There might be a dup of this one too.
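
The offset = 0 case mentioned here amounts to indexing with compile-time constants, roughly as in the sketch below. HeaderReadU16LEAtZero is a hypothetical name used only for illustration.

```c
#include <stdint.h>

/* Both indices are constants, so the two byte loads sit at fixed
   offsets 0 and 1 from the same base pointer. */
uint16_t HeaderReadU16LEAtZero(const uint8_t *RomHeader)
{
    return (uint16_t)(RomHeader[0] | (RomHeader[1] << 8));
}
```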
* [Bug tree-optimization/102391] Failure to optimize adjacent 8-bit loads into a single bigger load
From: rguenth at gcc dot gnu.org @ 2021-09-20 8:42 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102391
--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
The bswap pass is in principle able to handle these, but it sees

  _1 = (sizetype) offset_12(D);
  _2 = RomHeader_13(D) + _1;
  _3 = *_2;
  _4 = (signed short) _3;
  _5 = _1 + 1;
  _6 = RomHeader_13(D) + _5;
  _7 = *_6;

so the constant offset is not forwarded to the MEM_REFs (an int vs. size_t
issue), and the bswap pass doesn't perform any fancy dataref analysis to spot
accesses at constant offsets from the same base (it could simply use
split_constant_offset on the found base, I guess, or invoke DR analysis in BB
mode).
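
Following this analysis, one source-level change that might give the pass a common base is to form the pointer once and index it with constants, so that the + 1 is attached to the memory reference rather than to the converted offset. This is a sketch, not a confirmed workaround: HeaderReadU16LE_Base is a hypothetical name, and whether a given GCC version then emits a single load would need to be checked in the generated assembly.

```c
#include <stdint.h>

uint16_t HeaderReadU16LE_Base(int offset, const uint8_t *RomHeader)
{
    /* Apply the variable offset once ... */
    const uint8_t *base = RomHeader + offset;

    /* ... then read at constant offsets 0 and 1 from that base. */
    return (uint16_t)(base[0] | (base[1] << 8));
}
```

The function returns the same value as the original HeaderReadU16LE; only the way the addresses are expressed differs.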
* [Bug tree-optimization/102391] Failure to optimize adjacent 8-bit loads into a single bigger load
From: pinskia at gcc dot gnu.org @ 2021-12-15 23:31 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102391
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |DUPLICATE
--- Comment #4 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Dup of bug 98953.
*** This bug has been marked as a duplicate of bug 98953 ***