* [Bug rtl-optimization/21527] New: BYTEmark bitmap test: Regression with Profiled Optimization
From: jbucata at tulsaconnect dot com @ 2005-05-12 7:37 UTC (permalink / raw)
To: gcc-bugs
Another follow-on bug, similar to bug 21485. The profiled optimizer on both
3.4.3 and 4.0.0 produces slower code than a build without it: run times go up by 50% or more.
Relevant flags are: -static -O3 -march=athlon-xp -fomit-frame-pointer
The test case runs a fixed number of iterations; time it from the command line to compare.
This is probably an interesting bug on its own, or at least I hope so. However,
what I originally discovered with the real BYTEmark code was that 3.4.3's
profiled optimizer made the code better, but 4.0.0's made it worse. Somehow, in
the process of combining things into one file and removing redundant code,
3.4.3's profiled optimizer started pessimizing the code as well. Let me know if
I should attempt to reproduce that behavior again (though I might not be able to
come up with the single-preprocessed-file testcase for that).
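A minimal sketch of how the comparison described above could be scripted. It only assembles the command lines for a baseline build and for a profile-generate / training run / profile-use cycle, using the flags listed in the report together with -fprofile-generate and -fprofile-use as named later in this thread. The file names bitmap.i, ./bitmap and ./bitmap-base are hypothetical placeholders, not the names of the actual attachment.

```python
import shlex

# Flags quoted in the report; the file names below are hypothetical.
BASE_FLAGS = ["-static", "-O3", "-march=athlon-xp", "-fomit-frame-pointer"]


def pgo_steps(source="bitmap.i", exe="./bitmap"):
    """Return the three command lines of a profile-generate / profile-use cycle."""
    return [
        # 1. Instrumented build: the resulting binary writes profile data when run.
        ["gcc", *BASE_FLAGS, "-fprofile-generate", source, "-o", exe],
        # 2. Training run: the test case runs a fixed number of iterations.
        [exe],
        # 3. Rebuild, this time letting the compiler read the collected profile.
        ["gcc", *BASE_FLAGS, "-fprofile-use", source, "-o", exe],
    ]


def baseline_step(source="bitmap.i", exe="./bitmap-base"):
    """Return the command line for the non-profiled build used for comparison."""
    return ["gcc", *BASE_FLAGS, source, "-o", exe]


if __name__ == "__main__":
    # Print the commands instead of running them, so the sketch needs no compiler.
    for cmd in [baseline_step(), *pgo_steps()]:
        print(shlex.join(cmd))
```

Each resulting binary would then be timed from the command line, as the report suggests.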
--
Summary: BYTEmark bitmap test: Regression with Profiled Optimization
Product: gcc
Version: 4.0.0
Status: UNCONFIRMED
Severity: normal
Priority: P2
Component: rtl-optimization
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: jbucata at tulsaconnect dot com
CC: gcc-bugs at gcc dot gnu dot org
GCC host triplet: i686-pc-linux-gnu
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21527
* [Bug rtl-optimization/21527] BYTEmark bitmap test: Regression with Profiled Optimization
From: jbucata at tulsaconnect dot com @ 2005-05-12 7:39 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From jbucata at tulsaconnect dot com 2005-05-12 07:39 -------
Created an attachment (id=8869)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=8869&action=view)
preprocessed test case
* [Bug rtl-optimization/21527] BYTEmark bitmap test: Regression with Profiled Optimization
From: steven at gcc dot gnu dot org @ 2005-06-25 13:18 UTC (permalink / raw)
To: gcc-bugs
--
What                 |Removed                          |Added
-------------------------------------------------------------------------------------
AssignedTo           |unassigned at gcc dot gnu dot org|steven at gcc dot gnu dot org
Status               |UNCONFIRMED                      |ASSIGNED
Ever Confirmed       |                                 |1
Last reconfirmed date|0000-00-00 00:00:00              |2005-06-25 13:18:52
* [Bug rtl-optimization/21527] BYTEmark bitmap test: Regression with Profiled Optimization
From: steven at gcc dot gnu dot org @ 2005-07-15 11:56 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From steven at gcc dot gnu dot org 2005-07-15 11:52 -------
Average (of three runs) user times:
(1) is -march=i686 -O3 -fomit-frame-pointer
(2) is -march=i686 -O3 -funroll-loops -fomit-frame-pointer
(3) is -march=i686 -O3 -funroll-loops -fomit-frame-pointer -fprofile-use
(1) user 0m6.949s
(2) user 0m8.565s
(3) user 0m8.671s
Note that -fprofile-generate and -fprofile-use automatically enable loop
unrolling as well. So it looks like this is a non-bug: you're just being
bitten by loop unrolling, which seems to be the cause of the slowdowns in
this case.
I did the same timings on an AMD64 box, and there the times for the three
different binaries were roughly the same.
Could you try to see if your timings are poor without profiling but with
loop unrolling enabled?
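To put the three timings above on a common scale, here is a small sketch that converts the shell-style "0m6.949s" values to seconds and expresses (2) and (3) relative to (1). The helper names are illustrative only. On these numbers, (2) comes out about 23% slower than (1) and (3) about 25% slower, so nearly all of the gap is already present with -funroll-loops alone.

```python
import re


def parse_user_time(text):
    """Convert a shell `time` value such as '0m6.949s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", text.strip())
    if match is None:
        raise ValueError(f"unrecognised time value: {text!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


def slowdown_percent(baseline, other):
    """Relative change of `other` versus `baseline`, in percent."""
    return (other - baseline) / baseline * 100.0


# User times quoted above for configurations (1), (2) and (3).
times = {
    "(1) -O3": parse_user_time("0m6.949s"),
    "(2) -O3 -funroll-loops": parse_user_time("0m8.565s"),
    "(3) -O3 -funroll-loops -fprofile-use": parse_user_time("0m8.671s"),
}

if __name__ == "__main__":
    base = times["(1) -O3"]
    for label, value in times.items():
        print(f"{label}: {value:.3f}s ({slowdown_percent(base, value):+.1f}% vs (1))")
```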
--
What |Removed |Added
----------------------------------------------------------------------------
Status|ASSIGNED |WAITING
* [Bug rtl-optimization/21527] BYTEmark bitmap test: Regression with Profiled Optimization
From: jbucata at tulsaconnect dot com @ 2005-07-18 5:17 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From jbucata at tulsaconnect dot com 2005-07-18 04:42 -------
For me, with -march=athlon-xp, -funroll-loops on 4.0.0 did indeed pessimize
slightly. However, -fprofile-{generate,use} pessimized more on top of that. So
there's still a problem with regard to the PO.
I tried it again with your -march=i686 and it went from 9.5s => 13.5s user with
plain -funroll-loops. The PO made it a tidge worse from there. IOW, consistent
with what you saw.
Further, I tried -funroll-loops w/o the PO on 3.3.5 and 3.4.3 and it improved
run times for both, more so in 3.4.3 than in 3.3.5. So it looks like that's
another regression in 4.0.0, in -funroll-loops by itself!
* [Bug rtl-optimization/21527] BYTEmark bitmap test: Regression with Profiled Optimization
From: jbucata at tulsaconnect dot com @ 2005-07-18 21:38 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From jbucata at tulsaconnect dot com 2005-07-18 21:11 -------
FWIW, I'm trying 4.1.0 beta 20050716, and it does better than 4.0.1.
-funroll-loops still slows it down (by about 0.5s compared with leaving it off),
but without it, 4.1 shaves about 1.5 seconds off user time vs 4.0.1 (about
9.5s => 8.0s, or about 15% savings). It even does better than 3.4.3.
So maybe it's forgivable if -funroll-loops hurts 4.1's performance a bit, given
how much its base performance has improved over its predecessors. (Maybe
-funroll-loops helped so much on 3.4.3 because 3.4.3 simply generated
bad code without it.)
On 4.1, -fprofile-{generate,use} hurts another 0.1s on top of -funroll-loops.
This was all with: -O3 -march=athlon-xp -fomit-frame-pointer (forgot -static,
but in a quick test it didn't seem to help significantly)
So perhaps this isn't a big deal now after all, though I'd like to think that
the PO, a feature to enhance performance, wouldn't hurt code performance *at
all*, either directly or indirectly (by requiring -funroll-loops).
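Gaps as small as the 0.1s and 0.5s quoted above can fall within run-to-run noise, so averaging several runs, as was done for the earlier timings in this thread, may make the comparison more trustworthy. A minimal sketch, assuming a Unix host (the resource module is not available elsewhere); the workload in the usage section is a stand-in, not the benchmark itself.

```python
import resource
import subprocess
import sys


def average_user_time(cmd, runs=3):
    """Average user CPU seconds consumed by `cmd` over `runs` executions (Unix only)."""
    total = 0.0
    for _ in range(runs):
        # RUSAGE_CHILDREN accumulates usage of children that have been waited for,
        # so the difference around one run is that run's user time.
        before = resource.getrusage(resource.RUSAGE_CHILDREN).ru_utime
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
        after = resource.getrusage(resource.RUSAGE_CHILDREN).ru_utime
        total += after - before
    return total / runs


if __name__ == "__main__":
    # Stand-in workload; replace with the benchmark binary under test.
    cmd = [sys.executable, "-c", "sum(range(10**6))"]
    print(f"average user time: {average_user_time(cmd):.3f}s")
```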
* [Bug rtl-optimization/21527] BYTEmark bitmap test: Regression with Profiled Optimization
From: steven at gcc dot gnu dot org @ 2005-07-19 12:10 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From steven at gcc dot gnu dot org 2005-07-19 11:26 -------
Thanks for your feedback.
I agree that profile opts should never hurt the code quality. I am not sure
what the problem is in this particular case; I haven't looked at the actual
assembler output yet, or at how the compiler's decisions change when it has the
profile available to it.
Time to parse .s files then...
--
What                 |Removed            |Added
----------------------------------------------------------------------------
Status               |WAITING            |NEW
Last reconfirmed date|2005-06-25 13:18:52|2005-07-19 11:26:24