public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug tree-optimization/112117] New: Missed optimization of LICM that might need to be combined with partial loop unrolling
@ 2023-10-28 12:24 652023330028 at smail dot nju.edu.cn
2023-10-30 13:16 ` [Bug tree-optimization/112117] " rguenth at gcc dot gnu.org
0 siblings, 1 reply; 2+ messages in thread
From: 652023330028 at smail dot nju.edu.cn @ 2023-10-28 12:24 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112117
Bug ID: 112117
Summary: Missed optimization of LICM that might need to be
combined with partial loop unrolling
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: 652023330028 at smail dot nju.edu.cn
Target Milestone: ---
Hello, we found some optimizations (regarding LICM) that GCC may have missed.
We would greatly appreicate if you can take a look and let us know what you
think.
Given the following code:
https://godbolt.org/z/sGYhMKrzK
int a, b, c, d;
int m,n;
void test() {
for (int i = 0; i < 1000; i += 1) {
d += c + a;
a = c;
}
}
We can notice that the value of c+a is unchanged except when entering the loop
for the first time.
But GCC -O3 -fwrapv generates code that looks inefficient:
test():
mov esi, DWORD PTR c[rip]
mov eax, DWORD PTR a[rip]
mov edx, 1000
mov ecx, DWORD PTR d[rip]
jmp .L2
.L3:
mov eax, esi
.L2:
add eax, esi
add ecx, eax
sub edx, 1
jne .L3
mov DWORD PTR d[rip], ecx
mov DWORD PTR a[rip], esi
ret
Our analysis of LLVM shows that this problem may be solved by partial loop
unrolling followed by LICM.
Thank you very much for your time and effort! We look forward to hearing from
you.
^ permalink raw reply [flat|nested] 2+ messages in thread
* [Bug tree-optimization/112117] Missed optimization of LICM that might need to be combined with partial loop unrolling
2023-10-28 12:24 [Bug tree-optimization/112117] New: Missed optimization of LICM that might need to be combined with partial loop unrolling 652023330028 at smail dot nju.edu.cn
@ 2023-10-30 13:16 ` rguenth at gcc dot gnu.org
0 siblings, 0 replies; 2+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-10-30 13:16 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112117
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Last reconfirmed| |2023-10-30
Status|UNCONFIRMED |NEW
Ever confirmed|0 |1
--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
Confirmed. Peeling a single iteration might look like the easiest way here,
the trick is to see we can then eliminate an add and a PHI node in the loop.
Note with the current pass ordering this would come quite late since the
required store-motion is performed only before PRE. It could also be
seen as hoisting opportunity, transforming
# a_lsm.7_16 = PHI <c.0_1(3), a_lsm.7_6(2)>
_3 = c.0_1 + a_lsm.7_16;
into
tem1 = c.0_1 + c.0_1;
tem2 = c.0_1 + a_lsm.7_6;
# _3 = PHI <tem1(3), tem2(2)>
but the rest wouldn't reduce as nicely to a multiplication then.
^ permalink raw reply [flat|nested] 2+ messages in thread
end of thread, other threads:[~2023-10-30 13:16 UTC | newest]
Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-10-28 12:24 [Bug tree-optimization/112117] New: Missed optimization of LICM that might need to be combined with partial loop unrolling 652023330028 at smail dot nju.edu.cn
2023-10-30 13:16 ` [Bug tree-optimization/112117] " rguenth at gcc dot gnu.org
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).