* [Bug ipa/114215] GCC makes wrong decision for inline with -Os or -Oz to deal with trivial functions
2024-03-03 2:53 [Bug rtl-optimization/114215] New: GCC makes wrong decision for inline with -Os or -Oz to deal with trivial functions unlvsur at live dot com
@ 2024-03-03 3:06 ` pinskia at gcc dot gnu.org
2024-03-03 3:18 ` pinskia at gcc dot gnu.org
` (6 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: pinskia at gcc dot gnu.org @ 2024-03-03 3:06 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114215
--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Created attachment 57597
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57597&action=edit
Testcase
Next time, please attach the testcase rather than just linking to godbolt.
* [Bug ipa/114215] GCC makes wrong decision for inline with -Os or -Oz to deal with trivial functions
From: pinskia at gcc dot gnu.org @ 2024-03-03 3:18 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114215
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Ever confirmed|0 |1
Status|UNCONFIRMED |WAITING
Last reconfirmed| |2024-03-03
--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
checkedvector::value_type& checkedvector::operator[](size_type)/167 call is
unlikely and code size would grow
freq:1.00 loop depth: 0 size: 4 time: 13 callee size: 6 stack: 0
op1 is compile time invariant
If I had:
```
void test_demovector(checkedvector& vec, int x) noexcept
{
    for(int i = 0; i < x; i++)
        vec[i]=5;
}
```
Then GCC will inline operator[]:
checkedvector::value_type& checkedvector::operator[](size_type)/167 inlined
freq:8.09
Stack frame offset 0, callee self size 0
void __builtin_trap()/205 function body not available
freq:0.00 loop depth: 1 size: 1 time: 1
I suspect this is the right heuristic. Do you have real code where the
inlining does not happen at -Os/-Oz, or are you just looking at the code
generation for benchmark-style code?
* [Bug ipa/114215] GCC makes wrong decision for inline with -Os or -Oz to deal with trivial functions
From: unlvsur at live dot com @ 2024-03-03 3:27 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114215
--- Comment #3 from cqwrteur <unlvsur at live dot com> ---
test_demovector(checkedvector&):
pushq %rbx
movq %rdi, %rbx
pushq $4
popq %rsi
call checkedvector::operator[](unsigned long)
movq %rbx, %rdi
movq $5, (%rax)
pushq $6
popq %rsi
call checkedvector::operator[](unsigned long)
movq %rbx, %rdi
movq $5, (%rax)
pushq $10
popq %rsi
call checkedvector::operator[](unsigned long)
movq %rbx, %rdi
movq $5, (%rax)
pushq $5
popq %rsi
call checkedvector::operator[](unsigned long)
movq $5, (%rax)
popq %rbx
ret
test_demovector_forceinline(checkedvector&):
movq (%rdi), %rax
movq 8(%rdi), %rdx
subq %rax, %rdx
cmpq $32, %rdx
ja .L7
.L8:
ud2
.L7:
movq $5, 32(%rax)
cmpq $48, %rdx
jbe .L8
movq $5, 48(%rax)
cmpq $80, %rdx
jbe .L8
movq $5, 80(%rax)
movq $5, 40(%rax)
ret
See? The first one has more instructions than the second one.
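For readers without the attachment, the assembly above is consistent with source of roughly the following shape. This is only a sketch reconstructed from the listing, not the attached testcase: the member names `begin_ptr` and `end_ptr`, the constructor, and the choice of `std::size_t` as `value_type` are assumptions (the listing only shows 8-byte elements, two pointers at offsets 0 and 8, and the indices 4, 6, 10 and 5). `[[gnu::always_inline]]` is used here as one plausible way to write `index_forceinline`.

```cpp
#include <cstddef>

// Hypothetical reconstruction; the real testcase is attachment 57597.
class checkedvector
{
public:
    using value_type = std::size_t; // assumption: any 8-byte type fits the listing
    using size_type = std::size_t;

    checkedvector(value_type* first, value_type* last) noexcept
        : begin_ptr(first), end_ptr(last) {}

    // Defined in-class, so it has vague linkage and the inliner's
    // heuristics decide whether each call is expanded.
    value_type& operator[](size_type idx) noexcept
    {
        if (static_cast<size_type>(end_ptr - begin_ptr) <= idx)
            __builtin_trap(); // out-of-bounds access: crash
        return begin_ptr[idx];
    }

    // Same body, but inlining is forced regardless of -Os/-Oz heuristics.
    [[gnu::always_inline]] inline value_type& index_forceinline(size_type idx) noexcept
    {
        if (static_cast<size_type>(end_ptr - begin_ptr) <= idx)
            __builtin_trap();
        return begin_ptr[idx];
    }

private:
    value_type* begin_ptr;
    value_type* end_ptr;
};

// Four straight-line calls, matching the indices seen in the assembly.
void test_demovector(checkedvector& vec) noexcept
{
    vec[4] = 5;
    vec[6] = 5;
    vec[10] = 5;
    vec[5] = 5;
}

void test_demovector_forceinline(checkedvector& vec) noexcept
{
    vec.index_forceinline(4) = 5;
    vec.index_forceinline(6) = 5;
    vec.index_forceinline(10) = 5;
    vec.index_forceinline(5) = 5;
}
```

Compiling something like this with `-Os` and comparing the two functions should show the same kind of difference as the listing above, though the exact output depends on the GCC version and target.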
* [Bug ipa/114215] GCC makes wrong decision for inline with -Os or -Oz to deal with trivial functions
From: unlvsur at live dot com @ 2024-03-03 3:30 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114215
--- Comment #4 from cqwrteur <unlvsur at live dot com> ---
void test_demovector(checkedvector& vec, __SIZE_TYPE__ x) noexcept
{
for(__SIZE_TYPE__ i = 0; i < x; i++)
vec[i]=5;
}
void test_demovector_forceinline(checkedvector& vec, __SIZE_TYPE__ x) noexcept
{
for(__SIZE_TYPE__ i = 0; i < x; i++)
vec.index_forceinline(i)=5;
}
The first one still has more instructions, even with a loop.
test_demovector(checkedvector&, unsigned long):
pushq %r12
movq %rdi, %r12
pushq %rbp
movq %rsi, %rbp
pushq %rbx
xorl %ebx, %ebx
.L10:
cmpq %rbp, %rbx
je .L13
movq %rbx, %rsi
movq %r12, %rdi
incq %rbx
call checkedvector::operator[](unsigned long)
movq $5, (%rax)
jmp .L10
.L13:
popq %rbx
popq %rbp
popq %r12
ret
test_demovector_forceinline(checkedvector&, unsigned long):
xorl %eax, %eax
.L15:
cmpq %rsi, %rax
je .L18
movq (%rdi), %rcx
movq 8(%rdi), %rdx
subq %rcx, %rdx
sarq $3, %rdx
cmpq %rdx, %rax
jb .L16
ud2
.L16:
movq $5, (%rcx,%rax,8)
incq %rax
jmp .L15
.L18:
ret
* [Bug ipa/114215] GCC makes wrong decision for inline with -Os or -Oz to deal with trivial functions
From: pinskia at gcc dot gnu.org @ 2024-03-03 3:36 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114215
--- Comment #5 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Still waiting on a full application rather than small benchmark-type sources.
The heuristic here is that if you call operator[] multiple times, it might be
better not to inline it for size reasons.
* [Bug ipa/114215] -Os or -Oz inlining seems wrong for vague linkage functions
From: unlvsur at live dot com @ 2024-03-03 3:44 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114215
--- Comment #6 from cqwrteur <unlvsur at live dot com> ---
(In reply to Andrew Pinski from comment #5)
> Still waiting on a full application rather than small benchmark-type
> sources. The heuristic here is that if you call operator[] multiple times, it
> might be better not to inline it for size reasons.
I know, but here the function is so small that it is always better to inline
it, because all it does is an efficient bounds check. You do not want bounds
checking to be a function call.
* [Bug ipa/114215] -Os or -Oz inlining seems wrong for vague linkage functions
From: unlvsur at live dot com @ 2024-03-03 3:45 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114215
--- Comment #7 from cqwrteur <unlvsur at live dot com> ---
__builtin_trap() is only there to crash the program.
* [Bug ipa/114215] -Os or -Oz inlining seems wrong for vague linkage functions
From: unlvsur at live dot com @ 2024-03-06 2:23 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114215
--- Comment #8 from cqwrteur <unlvsur at live dot com> ---
(In reply to Andrew Pinski from comment #5)
> Still waiting on a full application rather than small benchmark-type
> sources. The heuristic here is that if you call operator[] multiple times, it
> might be better not to inline it for size reasons.
LLVM actually makes a different decision here. GCC probably thinks
__builtin_trap is expensive, even though it is not.