public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug c/40093] New: Optimization by functios reordering.
@ 2009-05-10 16:39 vvv at ru dot ru
2009-05-10 16:43 ` [Bug c/40093] " vvv at ru dot ru
` (5 more replies)
0 siblings, 6 replies; 7+ messages in thread
From: vvv at ru dot ru @ 2009-05-10 16:39 UTC (permalink / raw)
To: gcc-bugs
Because memory controller prefetch memory blocks, execution time of functions
calls sequence depend on order this functions in memory. For example:
4 calls:
call func1
call func2
call func3
call func4
faster in case of direct functions order in memmory:
.p2align 4
func1:
ret
.p2align 4
func2:
ret
.p2align 4
func3:
ret
.p2align 4
func4:
ret
and slow in case inverse order:
.p2align 4
func4:
ret
.p2align 4
func3:
ret
.p2align 4
func2:
ret
.p2align 4
func1:
ret
Unfortunately, inverse order is typical for C/C++.
what do you think about this kind optimization?
--
Summary: Optimization by functios reordering.
Product: gcc
Version: 4.4.0
Status: UNCONFIRMED
Severity: enhancement
Priority: P3
Component: c
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: vvv at ru dot ru
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40093
^ permalink raw reply [flat|nested] 7+ messages in thread
* [Bug c/40093] Optimization by functios reordering.
2009-05-10 16:39 [Bug c/40093] New: Optimization by functios reordering vvv at ru dot ru
@ 2009-05-10 16:43 ` vvv at ru dot ru
2009-05-10 16:49 ` [Bug middle-end/40093] " pinskia at gcc dot gnu dot org
` (4 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: vvv at ru dot ru @ 2009-05-10 16:43 UTC (permalink / raw)
To: gcc-bugs
------- Comment #1 from vvv at ru dot ru 2009-05-10 16:43 -------
Created an attachment (id=17847)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=17847&action=view)
Example direct/inverse calls
Simple example. RDTSC ticks for direct and inverse sequence of calls.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40093
^ permalink raw reply [flat|nested] 7+ messages in thread
* [Bug middle-end/40093] Optimization by functios reordering.
2009-05-10 16:39 [Bug c/40093] New: Optimization by functios reordering vvv at ru dot ru
2009-05-10 16:43 ` [Bug c/40093] " vvv at ru dot ru
@ 2009-05-10 16:49 ` pinskia at gcc dot gnu dot org
2009-05-10 18:08 ` vvv at ru dot ru
` (3 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2009-05-10 16:49 UTC (permalink / raw)
To: gcc-bugs
------- Comment #2 from pinskia at gcc dot gnu dot org 2009-05-10 16:49 -------
This should have been done already with cgraph order.
--
pinskia at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
Component|c |middle-end
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40093
^ permalink raw reply [flat|nested] 7+ messages in thread
* [Bug middle-end/40093] Optimization by functios reordering.
2009-05-10 16:39 [Bug c/40093] New: Optimization by functios reordering vvv at ru dot ru
2009-05-10 16:43 ` [Bug c/40093] " vvv at ru dot ru
2009-05-10 16:49 ` [Bug middle-end/40093] " pinskia at gcc dot gnu dot org
@ 2009-05-10 18:08 ` vvv at ru dot ru
2009-05-10 18:10 ` pinskia at gcc dot gnu dot org
` (2 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: vvv at ru dot ru @ 2009-05-10 18:08 UTC (permalink / raw)
To: gcc-bugs
------- Comment #3 from vvv at ru dot ru 2009-05-10 18:08 -------
(In reply to comment #2)
> This should have been done already with cgraph order.
Unfortunately, I can see inverse order only in separate source file. Inverse
but not optimized.
Example:
// file order1.c
#include <stdio.h>
main(int argc, char **argv)
{int i,j,k,l;
i=func1();
j=func2();
k=func3();
l=func4();
printf("%d %d %d %d\n",i,j,k,l);
}
=====================================
// file order2.c
int func1(){ return(F(4));}
int func2(){ return(F(3));}
int func3(){ return(F(2));}
int func4(){ return(F(1));}
=====================================
// file order3.c
int F(int x){ return(x);}
# gcc --version
gcc (GCC) 4.5.0 20090508 (experimental)
# gcc -o order order3.c order2.c order1.c -O2
# objdump -d order
0000000000400520 <F>:
400520: 89 f8 mov %edi,%eax
400522: c3 retq
0000000000400530 <func4>:
400530: bf 01 00 00 00 mov $0x1,%edi
400535: 31 c0 xor %eax,%eax
400537: e9 e4 ff ff ff jmpq 400520 <F>
0000000000400540 <func3>:
400540: bf 02 00 00 00 mov $0x2,%edi
400545: 31 c0 xor %eax,%eax
400547: e9 d4 ff ff ff jmpq 400520 <F>
0000000000400550 <func2>:
400550: bf 03 00 00 00 mov $0x3,%edi
400555: 31 c0 xor %eax,%eax
400557: e9 c4 ff ff ff jmpq 400520 <F>
0000000000400560 <func1>:
400560: bf 04 00 00 00 mov $0x4,%edi
400565: 31 c0 xor %eax,%eax
400567: e9 b4 ff ff ff jmpq 400520 <F>
0000000000400570 <main>:
400570: 48 89 5c 24 e8 mov %rbx,-0x18(%rsp)
400575: 48 89 6c 24 f0 mov %rbp,-0x10(%rsp)
40057a: 31 c0 xor %eax,%eax
40057c: 4c 89 64 24 f8 mov %r12,-0x8(%rsp)
400581: 48 83 ec 18 sub $0x18,%rsp
400585: e8 d6 ff ff ff callq 400560 <func1>
40058a: 89 c3 mov %eax,%ebx
40058c: 31 c0 xor %eax,%eax
40058e: e8 bd ff ff ff callq 400550 <func2>
400593: 89 c5 mov %eax,%ebp
400595: 31 c0 xor %eax,%eax
400597: e8 a4 ff ff ff callq 400540 <func3>
40059c: 41 89 c4 mov %eax,%r12d
40059f: 31 c0 xor %eax,%eax
4005a1: e8 8a ff ff ff callq 400530 <func4>
4005a6: 44 89 e1 mov %r12d,%ecx
4005a9: 41 89 c0 mov %eax,%r8d
4005ac: 89 ea mov %ebp,%edx
4005ae: 89 de mov %ebx,%esi
4005b0: 48 8b 6c 24 08 mov 0x8(%rsp),%rbp
4005b5: 48 8b 1c 24 mov (%rsp),%rbx
4005b9: 4c 8b 64 24 10 mov 0x10(%rsp),%r12
4005be: bf bc 06 40 00 mov $0x4006bc,%edi
4005c3: 31 c0 xor %eax,%eax
4005c5: 48 83 c4 18 add $0x18,%rsp
4005c9: e9 42 fe ff ff jmpq 400410 <printf@plt>
=====================================
But optimal:
0000000000400520 <main>:
400520: 48 89 5c 24 e8 mov %rbx,-0x18(%rsp)
400525: 48 89 6c 24 f0 mov %rbp,-0x10(%rsp)
40052a: 31 c0 xor %eax,%eax
40052c: 4c 89 64 24 f8 mov %r12,-0x8(%rsp)
400531: 48 83 ec 18 sub $0x18,%rsp
400535: e8 46 00 00 00 callq 400580 <func1>
40053a: 89 c3 mov %eax,%ebx
40053c: 31 c0 xor %eax,%eax
40053e: e8 4d 00 00 00 callq 400590 <func2>
400543: 89 c5 mov %eax,%ebp
400545: 31 c0 xor %eax,%eax
400547: e8 54 00 00 00 callq 4005a0 <func3>
40054c: 41 89 c4 mov %eax,%r12d
40054f: 31 c0 xor %eax,%eax
400551: e8 5a 00 00 00 callq 4005b0 <func4>
400556: 44 89 e1 mov %r12d,%ecx
400559: 41 89 c0 mov %eax,%r8d
40055c: 89 ea mov %ebp,%edx
40055e: 89 de mov %ebx,%esi
400560: 48 8b 6c 24 08 mov 0x8(%rsp),%rbp
400565: 48 8b 1c 24 mov (%rsp),%rbx
400569: 4c 8b 64 24 10 mov 0x10(%rsp),%r12
40056e: bf bc 06 40 00 mov $0x4006bc,%edi
400573: 31 c0 xor %eax,%eax
400575: 48 83 c4 18 add $0x18,%rsp
400579: e9 92 fe ff ff jmpq 400410 <printf@plt>
0000000000400580 <func1>:
400580: bf 01 00 00 00 mov $0x1,%edi
400585: 31 c0 xor %eax,%eax
400587: e9 34 00 00 00 jmpq 4005c0 <F>
0000000000400590 <func2>:
400590: bf 02 00 00 00 mov $0x2,%edi
400595: 31 c0 xor %eax,%eax
400597: e9 24 00 00 00 jmpq 4005c0 <F>
00000000004005a0 <func3>:
4005a0: bf 03 00 00 00 mov $0x3,%edi
4005a5: 31 c0 xor %eax,%eax
4005a7: e9 14 00 00 00 jmpq 4005c0 <F>
00000000004005b0 <func4>:
4005b0: bf 04 00 00 00 mov $0x4,%edi
4005b5: 31 c0 xor %eax,%eax
4005b7: e9 04 00 00 00 jmpq 4005c0 <F>
00000000004005c0 <F>:
4005c0: 89 f8 mov %edi,%eax
4005c2: c3 retq
===========
BTW, how about reordering idea for data?
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40093
^ permalink raw reply [flat|nested] 7+ messages in thread
* [Bug middle-end/40093] Optimization by functios reordering.
2009-05-10 16:39 [Bug c/40093] New: Optimization by functios reordering vvv at ru dot ru
` (2 preceding siblings ...)
2009-05-10 18:08 ` vvv at ru dot ru
@ 2009-05-10 18:10 ` pinskia at gcc dot gnu dot org
2009-05-10 18:20 ` vvv at ru dot ru
2009-06-04 14:50 ` hubicka at gcc dot gnu dot org
5 siblings, 0 replies; 7+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2009-05-10 18:10 UTC (permalink / raw)
To: gcc-bugs
------- Comment #4 from pinskia at gcc dot gnu dot org 2009-05-10 18:10 -------
Well you need whole program to get the behavior which you want.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40093
^ permalink raw reply [flat|nested] 7+ messages in thread
* [Bug middle-end/40093] Optimization by functios reordering.
2009-05-10 16:39 [Bug c/40093] New: Optimization by functios reordering vvv at ru dot ru
` (3 preceding siblings ...)
2009-05-10 18:10 ` pinskia at gcc dot gnu dot org
@ 2009-05-10 18:20 ` vvv at ru dot ru
2009-06-04 14:50 ` hubicka at gcc dot gnu dot org
5 siblings, 0 replies; 7+ messages in thread
From: vvv at ru dot ru @ 2009-05-10 18:20 UTC (permalink / raw)
To: gcc-bugs
------- Comment #5 from vvv at ru dot ru 2009-05-10 18:20 -------
(In reply to comment #4)
> Well you need whole program to get the behavior which you want.
Yes. Of course, it's no problem for small single-programmer project, but it's
problem for big projects like Linux Kernel.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40093
^ permalink raw reply [flat|nested] 7+ messages in thread
* [Bug middle-end/40093] Optimization by functios reordering.
2009-05-10 16:39 [Bug c/40093] New: Optimization by functios reordering vvv at ru dot ru
` (4 preceding siblings ...)
2009-05-10 18:20 ` vvv at ru dot ru
@ 2009-06-04 14:50 ` hubicka at gcc dot gnu dot org
5 siblings, 0 replies; 7+ messages in thread
From: hubicka at gcc dot gnu dot org @ 2009-06-04 14:50 UTC (permalink / raw)
To: gcc-bugs
------- Comment #6 from hubicka at gcc dot gnu dot org 2009-06-04 14:50 -------
There is simple algoritm reordering functions so calls more commonly leads to
following function in memory. (just order calls by frequency and concatenate
them into sequences and then order sequences to promote forward calls).
The problem here is that this conflicts with the optimizations propagatng
information top-down across callgraph (like stack alignment).
So I can write easilly pass that will give desired order, but I don't want
production of RTL bodies to happen this order and thus we need some assembler
support for this to order stuff properly (gas has subsections that would do
fine). I was just bit lazy to implement this so far.
--
hubicka at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|UNCONFIRMED |NEW
Ever Confirmed|0 |1
Last reconfirmed|0000-00-00 00:00:00 |2009-06-04 14:50:20
date| |
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40093
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2009-06-04 14:50 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-05-10 16:39 [Bug c/40093] New: Optimization by functios reordering vvv at ru dot ru
2009-05-10 16:43 ` [Bug c/40093] " vvv at ru dot ru
2009-05-10 16:49 ` [Bug middle-end/40093] " pinskia at gcc dot gnu dot org
2009-05-10 18:08 ` vvv at ru dot ru
2009-05-10 18:10 ` pinskia at gcc dot gnu dot org
2009-05-10 18:20 ` vvv at ru dot ru
2009-06-04 14:50 ` hubicka at gcc dot gnu dot org
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).