public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug tree-optimization/111563] New: Missed optimization of LICM
@ 2023-09-24 3:42 652023330028 at smail dot nju.edu.cn
2023-09-24 3:54 ` [Bug tree-optimization/111563] " pinskia at gcc dot gnu.org
` (5 more replies)
0 siblings, 6 replies; 7+ messages in thread
From: 652023330028 at smail dot nju.edu.cn @ 2023-09-24 3:42 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111563
Bug ID: 111563
Summary: Missed optimization of LICM
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: 652023330028 at smail dot nju.edu.cn
Target Milestone: ---
Hello, we found some optimizations (regarding Loop Invariant Code Motion) that
GCC may have missed. We would greatly appreicate if you can take a look and let
us know what you think.
Here are two examples.
Example 1:
https://godbolt.org/z/cn9v45vdh
Given the following code:
```c++
extern int var_12;
extern int var_13;
extern int var_17;
void test(int var_0, int var_1, int var_5, int var_6, int var_9) {
for (int i_4 = 1; i_4 < var_0-2147483638; i_4 += 1)
{
var_17 += 1774623354;
if (var_9)
{
var_13 += var_1;
var_17 += var_6 + var_0;
var_12 += var_5;
}
else
{
var_17 += var_6;
}
i_4+=i_4/3;;
}
}
```
In fact, `var_6 + var_0` is a loop invariant, but gcc-trunk -O3:
```asm
test(int, int, int, int, int):
cmp edi, 2147483639
jle .L15
mov r9d, DWORD PTR var_17[rip]
lea r10d, [rdi-2147483638]
test r8d, r8d
jne .L4
lea edi, [rcx+1774623354]
mov edx, 1
lea ecx, [r9+1774623354+rcx]
.L5:
movsx rax, edx
mov esi, edx
imul rax, rax, 1431655766
sar esi, 31
shr rax, 32
sub eax, esi
lea edx, [rdx+1+rax]
mov eax, ecx
add ecx, edi
cmp r10d, edx
jg .L5
mov DWORD PTR var_17[rip], eax
ret
.L15:
ret
.L4:
mov eax, edi
mov edi, DWORD PTR var_13[rip]
push rbx
mov ebx, esi
mov r11d, edx
lea r8d, [rax+1774623354+rcx]
add edi, esi
mov esi, DWORD PTR var_12[rip]
add esi, edx
mov edx, 1
.L7:
movsx rax, edx
mov ecx, edx
add r9d, r8d
imul rax, rax, 1431655766
sar ecx, 31
shr rax, 32
sub eax, ecx
mov ecx, esi
add esi, r11d
lea edx, [rax+1+rdx]
mov eax, edi
add edi, ebx
cmp edx, r10d
jl .L7
mov DWORD PTR var_17[rip], r9d
pop rbx
mov DWORD PTR var_12[rip], ecx
mov DWORD PTR var_13[rip], eax
ret
```
Example 2:
https://godbolt.org/z/418K59eoE
Given the following code:
```c++
extern int var_24;
void test(int var_2, int var_3, int var_8, int var_10, int var_14) {
for (int i_2 = -3247424; i_2 < 19; i_2 += var_3 + 1056714155)
{
if(var_3){
var_24 += (-(200 / var_10)) + (-var_8);
var_24 += var_14 + var_2;
}
i_2+=i_2/3;
}
}
```
We note that `(-(200 / var_10)) + (-var_8) + var_14 + var_2` as a whole can be
treated as a loop invariant, but gcc-trunk -O3 does not:
```asm
test(int, int, int, int, int):
mov r10d, edi
mov r11d, edx
test esi, esi
je .L1
mov eax, -200
mov edx, -1
mov edi, DWORD PTR var_24[rip]
add r8d, r10d
idiv ecx
lea r9d, [rsi+1056714155]
mov ecx, -3247424
sub eax, r11d
.L3:
movsx rdx, ecx
mov esi, ecx
add edi, eax
imul rdx, rdx, 1431655766
sar esi, 31
add edi, r8d
shr rdx, 32
sub edx, esi
add ecx, edx
add ecx, r9d
cmp ecx, 18
jle .L3
mov DWORD PTR var_24[rip], edi
.L1:
ret
```
Thank you very much for your time and effort! We look forward to hearing from
you.
^ permalink raw reply [flat|nested] 7+ messages in thread
* [Bug tree-optimization/111563] Missed optimization of LICM
2023-09-24 3:42 [Bug tree-optimization/111563] New: Missed optimization of LICM 652023330028 at smail dot nju.edu.cn
@ 2023-09-24 3:54 ` pinskia at gcc dot gnu.org
2023-09-24 16:01 ` 652023330028 at smail dot nju.edu.cn
` (4 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-09-24 3:54 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111563
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Resolution|--- |INVALID
Status|UNCONFIRMED |RESOLVED
--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
_5 = var_0_16(D) + var_6_18(D);
invariant up to level 1, cost 1.
Basically because the cost is not high enough ...
If you use --param=lim-expensive=1. then it will pull it out of the loop.
So the cost model is doing the correct thing here ...
^ permalink raw reply [flat|nested] 7+ messages in thread
* [Bug tree-optimization/111563] Missed optimization of LICM
2023-09-24 3:42 [Bug tree-optimization/111563] New: Missed optimization of LICM 652023330028 at smail dot nju.edu.cn
2023-09-24 3:54 ` [Bug tree-optimization/111563] " pinskia at gcc dot gnu.org
@ 2023-09-24 16:01 ` 652023330028 at smail dot nju.edu.cn
2023-09-24 16:33 ` pinskia at gcc dot gnu.org
` (3 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: 652023330028 at smail dot nju.edu.cn @ 2023-09-24 16:01 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111563
--- Comment #2 from Yi <652023330028 at smail dot nju.edu.cn> ---
(In reply to Andrew Pinski from comment #1)
> _5 = var_0_16(D) + var_6_18(D);
> invariant up to level 1, cost 1.
>
> Basically because the cost is not high enough ...
>
> If you use --param=lim-expensive=1. then it will pull it out of the loop.
>
> So the cost model is doing the correct thing here ...
Thank you very much for your prompt reply! It took me some time to confirm our
work. For Example 1, GCC does exactly what you say it does. But for Example 2,
it doesn't seem to work as expected.
https://godbolt.org/z/eeWThnqWs
Given the following code:
```c++
extern int var_24;
void test(int var_2, int var_3, int var_8, int var_10, int var_14) {
for (int i_2 = -3247424; i_2 < 19; i_2 += var_3 + 1056714155)
{
if(var_3){
var_24 += (-(200 / var_10)) + (-var_8);
var_24 += var_14 + var_2;
}
i_2+=i_2/3;
}
}
```
We note that `(-(200 / var_10)) + (-var_8) + var_14 + var_2` as a whole can be
treated as a loop invariant, but gcc-trunk -O3 --param=lim-expensive=1 does
not:
```asm
test(int, int, int, int, int):
mov r10d, edx
test esi, esi
je .L1
mov eax, -200
mov edx, -1
mov r9d, DWORD PTR var_24[rip]
add r8d, edi
idiv ecx
lea edi, [rsi+1056714155]
mov ecx, -3247424
sub eax, r10d
.L3:
movsx rdx, ecx
mov esi, ecx
add r9d, eax # ... + (-(200 / var_10)) + (-var_8)
imul rdx, rdx, 1431655766
sar esi, 31
add r9d, r8d # ... + var_14 + var_2
shr rdx, 32
sub edx, esi
add ecx, edx
add ecx, edi
cmp ecx, 18
jle .L3
mov DWORD PTR var_24[rip], r9d
.L1:
ret
```
^ permalink raw reply [flat|nested] 7+ messages in thread
* [Bug tree-optimization/111563] Missed optimization of LICM
2023-09-24 3:42 [Bug tree-optimization/111563] New: Missed optimization of LICM 652023330028 at smail dot nju.edu.cn
2023-09-24 3:54 ` [Bug tree-optimization/111563] " pinskia at gcc dot gnu.org
2023-09-24 16:01 ` 652023330028 at smail dot nju.edu.cn
@ 2023-09-24 16:33 ` pinskia at gcc dot gnu.org
2023-09-24 16:38 ` 652023330028 at smail dot nju.edu.cn
` (2 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-09-24 16:33 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111563
--- Comment #3 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Yi from comment #2)
> (In reply to Andrew Pinski from comment #1)
> > _5 = var_0_16(D) + var_6_18(D);
> > invariant up to level 1, cost 1.
> >
> > Basically because the cost is not high enough ...
> >
> > If you use --param=lim-expensive=1. then it will pull it out of the loop.
> >
> > So the cost model is doing the correct thing here ...
>
>
> Thank you very much for your prompt reply! It took me some time to confirm
> our work. For Example 1, GCC does exactly what you say it does. But for
> Example 2, it doesn't seem to work as expected.
>
> We note that `(-(200 / var_10)) + (-var_8) + var_14 + var_2` as a whole can
> be treated as a loop invariant, but gcc-trunk -O3 --param=lim-expensive=1
> does not:
_4 = _2 + var_24_lsm.5_21;
_6 = _4 + _5;
So this is again reassociation with LIM, the same issue as PR 111560.
^ permalink raw reply [flat|nested] 7+ messages in thread
* [Bug tree-optimization/111563] Missed optimization of LICM
2023-09-24 3:42 [Bug tree-optimization/111563] New: Missed optimization of LICM 652023330028 at smail dot nju.edu.cn
` (2 preceding siblings ...)
2023-09-24 16:33 ` pinskia at gcc dot gnu.org
@ 2023-09-24 16:38 ` 652023330028 at smail dot nju.edu.cn
2023-09-25 10:22 ` 652023330028 at smail dot nju.edu.cn
2023-09-29 0:40 ` 652023330028 at smail dot nju.edu.cn
5 siblings, 0 replies; 7+ messages in thread
From: 652023330028 at smail dot nju.edu.cn @ 2023-09-24 16:38 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111563
--- Comment #4 from Yi <652023330028 at smail dot nju.edu.cn> ---
(In reply to Andrew Pinski from comment #3)
> (In reply to Yi from comment #2)
> > (In reply to Andrew Pinski from comment #1)
> > > _5 = var_0_16(D) + var_6_18(D);
> > > invariant up to level 1, cost 1.
> > >
> > > Basically because the cost is not high enough ...
> > >
> > > If you use --param=lim-expensive=1. then it will pull it out of the loop.
> > >
> > > So the cost model is doing the correct thing here ...
> >
> >
> > Thank you very much for your prompt reply! It took me some time to confirm
> > our work. For Example 1, GCC does exactly what you say it does. But for
> > Example 2, it doesn't seem to work as expected.
> >
> > We note that `(-(200 / var_10)) + (-var_8) + var_14 + var_2` as a whole can
> > be treated as a loop invariant, but gcc-trunk -O3 --param=lim-expensive=1
> > does not:
>
> _4 = _2 + var_24_lsm.5_21;
> _6 = _4 + _5;
>
> So this is again reassociation with LIM, the same issue as PR 111560.
I see. Thank you very much for your reply!
^ permalink raw reply [flat|nested] 7+ messages in thread
* [Bug tree-optimization/111563] Missed optimization of LICM
2023-09-24 3:42 [Bug tree-optimization/111563] New: Missed optimization of LICM 652023330028 at smail dot nju.edu.cn
` (3 preceding siblings ...)
2023-09-24 16:38 ` 652023330028 at smail dot nju.edu.cn
@ 2023-09-25 10:22 ` 652023330028 at smail dot nju.edu.cn
2023-09-29 0:40 ` 652023330028 at smail dot nju.edu.cn
5 siblings, 0 replies; 7+ messages in thread
From: 652023330028 at smail dot nju.edu.cn @ 2023-09-25 10:22 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111563
--- Comment #5 from Yi <652023330028 at smail dot nju.edu.cn> ---
(In reply to Andrew Pinski from comment #3)
> So this is again reassociation with LIM, the same issue as PR 111560.
For this similar code, GCC works as expected:
https://godbolt.org/z/3TaqfeTqb
```c++
extern int var_24;
int t;
void test(int var_2, int var_3, int var_8, int var_10, int var_14) {
for (int i_2 = -3247424; i_2 < 19; i_2 += var_3 + 1056714155)
{
var_24 += (-(200 / var_10)) + (-var_8);
var_24 += var_14 + var_2;
i_2+=i_2/3;
}
}
```
So it seems that this and PR 111560 may not be due to the same cause.
Because it doesn't seem to be relevant to the statement, "Our re-association
only produces a canonical order within a single expression."
Meanwhile, in Example 2, 'if(var_3)' is actually optimized out of the Loop by
Loop Unswitch. So maybe the rest of the loop should be optimized as expected
like this similar code?
^ permalink raw reply [flat|nested] 7+ messages in thread
* [Bug tree-optimization/111563] Missed optimization of LICM
2023-09-24 3:42 [Bug tree-optimization/111563] New: Missed optimization of LICM 652023330028 at smail dot nju.edu.cn
` (4 preceding siblings ...)
2023-09-25 10:22 ` 652023330028 at smail dot nju.edu.cn
@ 2023-09-29 0:40 ` 652023330028 at smail dot nju.edu.cn
5 siblings, 0 replies; 7+ messages in thread
From: 652023330028 at smail dot nju.edu.cn @ 2023-09-29 0:40 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111563
--- Comment #6 from Yi <652023330028 at smail dot nju.edu.cn> ---
(In reply to Andrew Pinski from comment #3)
> (In reply to Yi from comment #2)
> > (In reply to Andrew Pinski from comment #1)
> > > _5 = var_0_16(D) + var_6_18(D);
> > > invariant up to level 1, cost 1.
> > >
> > > Basically because the cost is not high enough ...
> > >
> > > If you use --param=lim-expensive=1. then it will pull it out of the loop.
> > >
> > > So the cost model is doing the correct thing here ...
> >
> >
> > Thank you very much for your prompt reply! It took me some time to confirm
> > our work. For Example 1, GCC does exactly what you say it does. But for
> > Example 2, it doesn't seem to work as expected.
> >
> > We note that `(-(200 / var_10)) + (-var_8) + var_14 + var_2` as a whole can
> > be treated as a loop invariant, but gcc-trunk -O3 --param=lim-expensive=1
> > does not:
>
> _4 = _2 + var_24_lsm.5_21;
> _6 = _4 + _5;
>
> So this is again reassociation with LIM, the same issue as PR 111560.
Hi Andrew,
We would greatly appreicate if you can help us here.
extern int var_24;
void test(int var_2, int var_3, int var_8, int var_10, int var_14) {
for (int i_2 = -3247424; i_2 < 19; i_2 += var_3 + 1056714155)
{
if(var_3){
var_24 += (-(200 / var_10)) + (-var_8);
var_24 += var_14 + var_2;
}
i_2+=i_2/3;
}
}
Again, for the above code, if it were the reassociation issue that causes the
loop invariant not being moved out of the loop,
the following code should have the same problem. However, the code below is
being optimized properly, thereby, there
seems to be some other factors at play here. It would be great if you can
confirm it or (if possible) fix it. Thank you
so much!
extern int var_24;
int t;
void test(int var_2, int var_3, int var_8, int var_10, int var_14) {
for (int i_2 = -3247424; i_2 < 19; i_2 += var_3 + 1056714155)
{
var_24 += (-(200 / var_10)) + (-var_8);
var_24 += var_14 + var_2;
i_2+=i_2/3;
}
}
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2023-09-29 0:40 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-09-24 3:42 [Bug tree-optimization/111563] New: Missed optimization of LICM 652023330028 at smail dot nju.edu.cn
2023-09-24 3:54 ` [Bug tree-optimization/111563] " pinskia at gcc dot gnu.org
2023-09-24 16:01 ` 652023330028 at smail dot nju.edu.cn
2023-09-24 16:33 ` pinskia at gcc dot gnu.org
2023-09-24 16:38 ` 652023330028 at smail dot nju.edu.cn
2023-09-25 10:22 ` 652023330028 at smail dot nju.edu.cn
2023-09-29 0:40 ` 652023330028 at smail dot nju.edu.cn
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).