public inbox for gcc-bugs@sourceware.org
* [Bug rtl-optimization/66890] New: function splitting only works with profile feedback
@ 2015-07-15 22:17 andi-gcc at firstfloor dot org
2015-07-16 7:05 ` [Bug rtl-optimization/66890] " andi-gcc at firstfloor dot org
` (2 more replies)
0 siblings, 3 replies; 4+ messages in thread
From: andi-gcc at firstfloor dot org @ 2015-07-15 22:17 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66890
Bug ID: 66890
Summary: function splitting only works with profile feedback
Product: gcc
Version: 5.0
Status: UNCONFIRMED
Severity: enhancement
Priority: P3
Component: rtl-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: andi-gcc at firstfloor dot org
Target Milestone: ---
Consider this simple example:
volatile int count;

int main()
{
    int i;

    for (i = 0; i < 100000; i++) {
        if (i == 999)
            count *= 2;
        count++;
    }
}
The default "EQ is unlikely" heuristic in predict.* predicts that the branch
if (i == 999) is unlikely to be taken. So the tracer moves the count *= 2
basic block out of line to conserve instruction cache.
gcc50 -O2 -S thotcold.c
        movl $1, %edx
        jmp .L2
        .p2align 4,,10
        .p2align 3
.L4:
        addl $1, %edx
.L2:
        cmpl $1000, %edx
        movl count(%rip), %eax
        je .L6
        addl $1, %eax
        cmpl $100000, %edx
        movl %eax, count(%rip)
        jne .L4
        xorl %eax, %eax
        ret
# out of line code
.L6:
        addl %eax, %eax
        movl %eax, count(%rip)
        movl count(%rip), %eax
        addl $1, %eax
        movl %eax, count(%rip)
        jmp .L4
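For comparison, the prediction that the heuristic makes implicitly can also be
written out by hand with GCC's __builtin_expect. The following is only a
sketch: the function name run_with_hint is made up for illustration, and the
loop is wrapped in a function that resets and returns count so the result can
be checked. It only states the hint; it does not claim any particular block
placement.

```c
/* Sketch: the loop from the report with the branch hint spelled out.
   run_with_hint is a hypothetical name, not part of the original test case. */
volatile int count;

int run_with_hint(void)
{
    int i;

    count = 0;
    for (i = 0; i < 100000; i++) {
        /* __builtin_expect(expr, 0) tells GCC that expr is expected to be
           false, which is what the EQ heuristic assumes by default here. */
        if (__builtin_expect(i == 999, 0))
            count *= 2;
        count++;
    }
    return count;
}
```

The hint does not change what the loop computes; it only gives the compiler
explicit branch prediction information.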
Now if we enable -freorder-blocks-and-partition, I would expect the block to
also be put into .text.unlikely to give an even better cache layout. But that
is not what happens: it generates the same code.
Only when I use actual profile feedback together with
-freorder-blocks-and-partition does the code actually end up in a separate
section (it also unrolled the loop, so the code looks a bit different):
gcc -O2 -fprofile-generate -freorder-blocks-and-partition thotcold.c
./a.out
gcc -O2 -fprofile-use -freorder-blocks-and-partition thotcold.c
...
        .cfi_endproc
        .section .text.unlikely
        .cfi_startproc
.L55:
        movl count(%rip), %ecx
        addl $1, %eax
        addl $1, %ecx
        cmpl $100000, %eax
        movl %ecx, count(%rip)
        je .L6
        cmpl $1, %edx
        je .L5
        cmpl $2, %edx
        je .L28
        cmpl $3, %edx
-freorder-blocks-and-partition should already use the extra section even
without profile feedback.
I tested some larger programs, and without profile feedback the unlikely
section is always empty.
The heuristics in predict.* often work quite well, and a lot of code would
benefit from moving cold code out of the way of the caches.
This would make it possible to use the option to improve frontend-bound code
without needing full profile feedback.
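Until the option does this on its own, one manual approximation is to move the
rarely executed block into a function marked with GCC's cold attribute, which
on many targets is placed in a separate subsection of the text section. This
is a sketch under assumptions, not what the requested change would generate:
the helper double_count and the wrapper run_with_cold_helper are hypothetical
names, and noinline is added so the helper stays a separate function.

```c
/* Sketch: hand-written hot/cold separation without profile feedback.
   double_count and run_with_cold_helper are hypothetical names. */
volatile int count;

/* cold: the function is unlikely to be executed; noinline keeps it out of
   the hot loop body so it can live apart from the hot code. */
static void __attribute__((cold, noinline)) double_count(void)
{
    count *= 2;
}

int run_with_cold_helper(void)
{
    int i;

    count = 0;
    for (i = 0; i < 100000; i++) {
        if (i == 999)
            double_count();   /* rarely taken path, now out of line */
        count++;
    }
    return count;
}
```

Compiling with gcc -O2 -S and looking at the section directives shows where
the helper ends up on a given target. The cost is a call on the cold path and
a manual annotation per block, which is exactly what splitting based on the
predict.* heuristics would avoid.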
* [Bug rtl-optimization/66890] function splitting only works with profile feedback
From: andi-gcc at firstfloor dot org @ 2015-07-16 7:05 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66890
--- Comment #3 from Andi Kleen <andi-gcc at firstfloor dot org> ---
I suspect the patch may be too simple, because it could get stuck on unlikely
but high-frequency edges in the cold area. Perhaps more of the code from the
non-partitioning reordering needs to be adapted.
* [Bug rtl-optimization/66890] function splitting only works with profile feedback
From: andi-gcc at firstfloor dot org @ 2015-07-17 17:42 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66890
--- Comment #4 from Andi Kleen <andi-gcc at firstfloor dot org> ---
Created attachment 36008
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=36008&action=edit
Updated patch with documentation and param
I updated the patch with proper documentation and a param for the cutoff.
In some tests it appears to do the right thing when building a Linux kernel.
* [Bug rtl-optimization/66890] function splitting only works with profile feedback
From: pinskia at gcc dot gnu.org @ 2023-05-16 22:26 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66890
--- Comment #6 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Really, I would have expected this loop to be split into 3 different loops,
like:
volatile int count;

int main()
{
    int i;

    for (i = 0; i < 999; i++) {
        if (i == 999)
            count *= 2;
        count++;
    }
    for (; i < 999+1; i++) {
        if (i == 999)
            count *= 2;
        count++;
    }
    for (; i < 100000; i++) {
        if (i == 999)
            count *= 2;
        count++;
    }
}
And then it would not have an extra branch inside the loop itself either.
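To make the last step concrete: once the loop is split this way, the condition
i == 999 is known in each piece (always false in the first and third loops,
always true in the second), so it can be folded away. The following is a
hand-written sketch of that end result, not compiler output; the function name
run_split is made up, and the function resets and returns count so the result
can be checked.

```c
/* Sketch: the three split loops after folding the now-constant condition.
   run_split is a hypothetical name. */
volatile int count;

int run_split(void)
{
    int i;

    count = 0;

    /* i in [0, 999): i == 999 is always false, so only the increment stays. */
    for (i = 0; i < 999; i++)
        count++;

    /* i == 999: the condition is always true; this loop runs exactly once. */
    for (; i < 999 + 1; i++) {
        count *= 2;
        count++;
    }

    /* i in [1000, 100000): i == 999 is always false again. */
    for (; i < 100000; i++)
        count++;

    return count;
}
```

None of the three loop bodies contains a compare against 999 any more, and the
doubling happens exactly once between the two hot loops.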