public inbox for gcc-bugs@sourceware.org
* [Bug other/58863] New: for loop not aligned at -O2 or -O3
From: ali.baharev at gmail dot com @ 2013-10-24 17:14 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58863
Bug ID: 58863
Summary: for loop not aligned at -O2 or -O3
Product: gcc
Version: 4.7.2
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: other
Assignee: unassigned at gcc dot gnu.org
Reporter: ali.baharev at gmail dot com
The for loop in work() is the hotspot:
const int LOOP_BOUND = 200000000;

__attribute__((noinline))
static int add(const int& x, const int& y) {
    return x + y;
}

__attribute__((noinline))
static int work(int xval, int yval) {
    int sum(0);
    for (int i=0; i<LOOP_BOUND; ++i) {
        int x(xval+sum);
        int y(yval+sum);
        int z = add(x, y);
        sum += z;
    }
    return sum;
}

int main(int , char* argv[]) {
    int result = work(*argv[1], *argv[2]);
    return result;
}
Running
g++ -O2 main.cpp && objdump -d a.out | c++filt
gives
400598: 41 8d 34 1c lea (%r12,%rbx,1),%esi
[...]
4005ab: 75 eb jne 400598 <work(int, int)+0x18>
According to the documentation:
http://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html
-falign-loops Enabled at levels -O2, -O3.
By analyzing the assembly code, it looks like gcc aligns things to the next 16
byte boundary by default on this machine in other cases.
If I pass -falign-loops=16 it becomes:
4005a0: 41 8d 34 1c lea (%r12,%rbx,1),%esi
[...]
4005b3: 75 eb jne 4005a0 <work(int, int)+0x20>
I guess it is also supposed to look like this when just -O2 is passed; at least
that is what the documentation suggests to me.
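The two loop-head addresses quoted above can be checked against 8 and 16 byte boundaries with a little arithmetic. A minimal sketch follows; the helper name offset_in_block is made up for illustration and is not part of gcc or binutils.

```cpp
#include <cstdint>

// Hypothetical helper: how many bytes addr lies past the previous multiple
// of `alignment` (alignment is assumed to be a power of two).
constexpr std::uint64_t offset_in_block(std::uint64_t addr, std::uint64_t alignment) {
    return addr & (alignment - 1);
}

// Loop head with plain -O2 (address taken from the objdump output above).
static_assert(offset_in_block(0x400598, 16) == 8, "8 bytes past a 16 byte boundary");
static_assert(offset_in_block(0x400598, 8) == 0, "on an 8 byte boundary");

// Loop head with -falign-loops=16.
static_assert(offset_in_block(0x4005a0, 16) == 0, "on a 16 byte boundary");
```

The checks run at compile time, so compiling the snippet (for example with g++ -std=c++11 -c) is enough to confirm them.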
* [Bug other/58863] for loop not aligned at -O2 or -O3
From: pinskia at gcc dot gnu.org @ 2013-10-24 17:23 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58863
--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
We have:
.p2align 4,,10
.p2align 3
so we align to a 16 byte boundary only if that means skipping at most 10 bytes,
but we still align to an 8 byte boundary in any case.
* [Bug other/58863] for loop not aligned at -O2 or -O3
From: ali.baharev at gmail dot com @ 2013-10-24 17:31 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58863
--- Comment #2 from Ali Baharev <ali.baharev at gmail dot com> ---
Please check with objdump. It's not what I get in the executable.
* [Bug other/58863] for loop not aligned at -O2 or -O3
From: pinskia at gcc dot gnu.org @ 2013-10-24 17:33 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58863
--- Comment #3 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Ali Baharev from comment #2)
> Please check with objdump. It's not what I get in the executable.
Yes it is. Read my comment again. We align loops to 8 bytes by default, but try
to align them to 16 bytes if that does not take more than 10 bytes of padding.
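The effect of the directive pair can be modelled in a few lines. This is only a sketch of the documented behaviour of `.p2align` with a maximum-skip argument, not code from gcc or the assembler; the function names are made up, and the unpadded start addresses in the checks are invented to show the two outcomes (only the results 0x400598 and 0x4005a0 appear in this report).

```cpp
#include <cstdint>

// Padding bytes needed to move addr up to the next multiple of 2^p2.
constexpr std::uint64_t padding_to(std::uint64_t addr, unsigned p2) {
    return (std::uint64_t{0} - addr) & ((std::uint64_t{1} << p2) - 1);
}

// Hypothetical model of `.p2align p2,,max_skip`: align addr to 2^p2, unless
// that needs more than max_skip padding bytes, in which case do nothing.
constexpr std::uint64_t p2align(std::uint64_t addr, unsigned p2, std::uint64_t max_skip) {
    return padding_to(addr, p2) <= max_skip ? addr + padding_to(addr, p2) : addr;
}

// `.p2align 4,,10` followed by `.p2align 3`. The second directive has no
// limit; 7 is the most padding an 8 byte alignment can ever need.
constexpr std::uint64_t align_loop_head(std::uint64_t addr) {
    return p2align(p2align(addr, 4, 10), 3, 7);
}

// 15 bytes short of a 16 byte boundary: too far, so only 8 byte aligned.
static_assert(align_loop_head(0x400591) == 0x400598, "falls back to 8 bytes");
// 10 bytes short of a 16 byte boundary: within the limit, so 16 byte aligned.
static_assert(align_loop_head(0x400596) == 0x4005a0, "reaches 16 bytes");
```

Under this model a loop head always ends up on an 8 byte boundary, and on a 16 byte boundary only when it started at most 10 bytes short of one.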
* [Bug other/58863] for loop not aligned at -O2 or -O3
From: ali.baharev at gmail dot com @ 2013-10-24 17:37 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58863
--- Comment #4 from Ali Baharev <ali.baharev at gmail dot com> ---
My mistake, sorry.
So, you are saying that the default alignment is 8 byte for loops?
The funny thing is, this code runs 15% faster if any of the following is
passed:
-Os
-O2 -fno-align-loops -fno-align-functions
-O2 -fno-omit-frame-pointer
At least on my machine and in this case, 16 byte alignment is better (or any
multiple of 16 byte). -march=native has no effect on the performance.
* [Bug other/58863] for loop not aligned at -O2 or -O3
From: ali.baharev at gmail dot com @ 2013-10-24 17:39 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58863
--- Comment #5 from Ali Baharev <ali.baharev at gmail dot com> ---
OK, so 8 byte alignment is the default for loops. If you think it is
not a bug, then let's close this. Sorry for the false alarm.
* [Bug other/58863] for loop not aligned at -O2 or -O3
From: rguenth at gcc dot gnu.org @ 2013-10-25 10:19 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58863
Richard Biener <rguenth at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |INVALID
--- Comment #6 from Richard Biener <rguenth at gcc dot gnu.org> ---
It works as designed.