public inbox for gcc-bugs@sourceware.org
* [Bug rtl-optimization/21527] New: BYTEmark bitmap test: Regression with Profiled Optimization
@ 2005-05-12  7:37 jbucata at tulsaconnect dot com
  2005-05-12  7:39 ` [Bug rtl-optimization/21527] " jbucata at tulsaconnect dot com
                   ` (5 more replies)
  0 siblings, 6 replies; 7+ messages in thread
From: jbucata at tulsaconnect dot com @ 2005-05-12  7:37 UTC (permalink / raw)
  To: gcc-bugs

Another follow-on bug similar to bug 21485.  With the profiled optimizer, both
3.4.3 and 4.0.0 generate slower code than without it.  Run times go up by 50%
or more.

Relevant flags are: -static -O3 -march=athlon-xp -fomit-frame-pointer

The test case runs a fixed number of iterations; time it from the command line
to compare.
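
A minimal sketch of the comparison, written in Python only to spell out the
command lines.  The file names `bitmap.i`, `bitmap_plain` and `bitmap_prof` are
hypothetical (the report does not name the attachment), and the two-pass
`-fprofile-generate` / `-fprofile-use` sequence is the usual GCC profiling
workflow, assumed here rather than quoted from the report.

```python
import shlex

# Flags quoted in the report.
BASE_FLAGS = ["-static", "-O3", "-march=athlon-xp", "-fomit-frame-pointer"]

# Hypothetical name for the preprocessed test case.
SOURCE = "bitmap.i"


def build_commands(source=SOURCE, base_flags=BASE_FLAGS):
    """Return the command lines for a plain build and a profiled build."""
    gcc = ["gcc"] + list(base_flags)
    return {
        "plain": [gcc + [source, "-o", "bitmap_plain"]],
        "profiled": [
            # Pass 1: build an instrumented binary.
            gcc + ["-fprofile-generate", source, "-o", "bitmap_prof"],
            # Training run: executing the binary writes the profile data.
            ["./bitmap_prof"],
            # Pass 2: rebuild, letting the compiler read the profile.
            gcc + ["-fprofile-use", source, "-o", "bitmap_prof"],
        ],
    }


if __name__ == "__main__":
    for name, steps in build_commands().items():
        print(f"# {name}")
        for step in steps:
            print(shlex.join(step))
        print()
```

Once both binaries exist, `time ./bitmap_plain` and `time ./bitmap_prof` give
the user times to compare.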

This is probably an interesting bug on its own, or at least I hope so.  However,
what I originally discovered with the real BYTEmark code was that 3.4.3's
profiled optimizer made the code better, but 4.0.0's made it worse.  Somehow, in
the process of combining things into one file and removing redundant code,
3.4.3's profiled optimizer started pessimizing the code as well.  Let me know if
I should attempt to reproduce that behavior again (though I might not be able to
come up with the single-preprocessed-file testcase for that).

-- 
           Summary: BYTEmark bitmap test: Regression with Profiled
                    Optimization
           Product: gcc
           Version: 4.0.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P2
         Component: rtl-optimization
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: jbucata at tulsaconnect dot com
                CC: gcc-bugs at gcc dot gnu dot org
  GCC host triplet: i686-pc-linux-gnu


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21527



* [Bug rtl-optimization/21527] BYTEmark bitmap test: Regression with Profiled Optimization
  2005-05-12  7:37 [Bug rtl-optimization/21527] New: BYTEmark bitmap test: Regression with Profiled Optimization jbucata at tulsaconnect dot com
@ 2005-05-12  7:39 ` jbucata at tulsaconnect dot com
  2005-06-25 13:18 ` steven at gcc dot gnu dot org
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: jbucata at tulsaconnect dot com @ 2005-05-12  7:39 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From jbucata at tulsaconnect dot com  2005-05-12 07:39 -------
Created an attachment (id=8869)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=8869&action=view)
preprocessed test case



* [Bug rtl-optimization/21527] BYTEmark bitmap test: Regression with Profiled Optimization
  2005-05-12  7:37 [Bug rtl-optimization/21527] New: BYTEmark bitmap test: Regression with Profiled Optimization jbucata at tulsaconnect dot com
  2005-05-12  7:39 ` [Bug rtl-optimization/21527] " jbucata at tulsaconnect dot com
@ 2005-06-25 13:18 ` steven at gcc dot gnu dot org
  2005-07-15 11:56 ` steven at gcc dot gnu dot org
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: steven at gcc dot gnu dot org @ 2005-06-25 13:18 UTC (permalink / raw)
  To: gcc-bugs



-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
         AssignedTo|unassigned at gcc dot gnu   |steven at gcc dot gnu dot
                   |dot org                     |org
             Status|UNCONFIRMED                 |ASSIGNED
     Ever Confirmed|                            |1
   Last reconfirmed|0000-00-00 00:00:00         |2005-06-25 13:18:52
               date|                            |



* [Bug rtl-optimization/21527] BYTEmark bitmap test: Regression with Profiled Optimization
  2005-05-12  7:37 [Bug rtl-optimization/21527] New: BYTEmark bitmap test: Regression with Profiled Optimization jbucata at tulsaconnect dot com
  2005-05-12  7:39 ` [Bug rtl-optimization/21527] " jbucata at tulsaconnect dot com
  2005-06-25 13:18 ` steven at gcc dot gnu dot org
@ 2005-07-15 11:56 ` steven at gcc dot gnu dot org
  2005-07-18  5:17 ` jbucata at tulsaconnect dot com
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: steven at gcc dot gnu dot org @ 2005-07-15 11:56 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From steven at gcc dot gnu dot org  2005-07-15 11:52 -------
Average (of three runs) user times: 
 
(1) is -march=i686 -O3 -fomit-frame-pointer 
(2) is -march=i686 -O3 -funroll-loops -fomit-frame-pointer 
(3) is -march=i686 -O3 -funroll-loops -fomit-frame-pointer -fprofile-use 
 
(1)  user    0m6.949s 
(2)  user    0m8.565s 
(3)  user    0m8.671s 
 
Note that -fprofile-generate and -fprofile-use automatically enable loop
unrolling as well.  So it looks like this is a non-bug: you're just being
bitten by loop unrolling, which seems to be the cause of the slowdowns in
this case.
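
To put the three averages side by side, here is a small Python sketch of the
arithmetic.  The timings are the ones quoted above; the dictionary labels and
the helper name `slowdown_pct` are only illustrative.

```python
# Average user times in seconds, as quoted above.
user_times = {
    "(1) -O3": 6.949,
    "(2) -O3 -funroll-loops": 8.565,
    "(3) -O3 -funroll-loops -fprofile-use": 8.671,
}


def slowdown_pct(baseline, measured):
    """Percentage by which `measured` exceeds `baseline`."""
    return (measured - baseline) / baseline * 100.0


base = user_times["(1) -O3"]
for label, seconds in user_times.items():
    print(f"{label:40s} {seconds:6.3f}s  {slowdown_pct(base, seconds):+5.1f}%")
```

On these numbers, unrolling alone accounts for a slowdown of roughly 23%
relative to (1), and -fprofile-use adds only about 1% more relative to (2).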
 
I did the same timings on an AMD64 box, and there the times for the three 
different binaries were roughly the same. 
 
Could you try to see if your timings are poor without profiling but with 
loop unrolling enabled? 
 

-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |WAITING



* [Bug rtl-optimization/21527] BYTEmark bitmap test: Regression with Profiled Optimization
  2005-05-12  7:37 [Bug rtl-optimization/21527] New: BYTEmark bitmap test: Regression with Profiled Optimization jbucata at tulsaconnect dot com
                   ` (2 preceding siblings ...)
  2005-07-15 11:56 ` steven at gcc dot gnu dot org
@ 2005-07-18  5:17 ` jbucata at tulsaconnect dot com
  2005-07-18 21:38 ` jbucata at tulsaconnect dot com
  2005-07-19 12:10 ` steven at gcc dot gnu dot org
  5 siblings, 0 replies; 7+ messages in thread
From: jbucata at tulsaconnect dot com @ 2005-07-18  5:17 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From jbucata at tulsaconnect dot com  2005-07-18 04:42 -------
For me, with -march=athlon-xp, -funroll-loops on 4.0.0 did indeed pessimize
slightly.  However, -fprofile-{generate,use} pessimized more on top of that.  So
there's still a problem with regard to the PO.

I tried it again with your -march=i686 and it went from 9.5s => 13.5s user with
plain -funroll-loops.  The PO made it a tidge worse from there.  IOW, consistent
with what you saw.

Further, I tried -funroll-loops w/o the PO on 3.3.5 and 3.4.3 and it improved
run times for both--more so in 3.4.3 than in 3.3.5.  So it looks like that's
another regression in 4.0.0, in -funroll-loops by itself!


* [Bug rtl-optimization/21527] BYTEmark bitmap test: Regression with Profiled Optimization
  2005-05-12  7:37 [Bug rtl-optimization/21527] New: BYTEmark bitmap test: Regression with Profiled Optimization jbucata at tulsaconnect dot com
                   ` (3 preceding siblings ...)
  2005-07-18  5:17 ` jbucata at tulsaconnect dot com
@ 2005-07-18 21:38 ` jbucata at tulsaconnect dot com
  2005-07-19 12:10 ` steven at gcc dot gnu dot org
  5 siblings, 0 replies; 7+ messages in thread
From: jbucata at tulsaconnect dot com @ 2005-07-18 21:38 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From jbucata at tulsaconnect dot com  2005-07-18 21:11 -------
FWIW, I'm trying 4.1.0 beta 20050716, and it does better than 4.0.1. 
-funroll-loops still slows it down (about 0.5s vs without it), but without, 4.1
shaves about 1.5 seconds off user time vs 4.0.1 (about 9.5s => 8.0s, or 15%
savings).  It even does better than 3.4.3.

So maybe it's forgivable if -funroll-loops hurts 4.1's performance a bit, given
how much its base performance has improved over its predecessors.  (Maybe the
fact that -funroll-loops on 3.4.3 helped so much was that 3.4.3 simply generated
bad code without it.)

On 4.1, -fprofile-{generate,use} hurts another 0.1s on top of -funroll-loops.
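
As a rough cross-check of the figures in this comment, a short Python sketch.
All inputs are the approximate values quoted above; the two 4.1 totals are
derived by adding the quoted penalties, not measured, and the variable names
are only illustrative.

```python
# Approximate user times (seconds) taken from the comment above.
gcc_401_base = 9.5   # 4.0.1, without -funroll-loops
gcc_41_base = 8.0    # 4.1.0 beta, without -funroll-loops

# Penalties quoted above; the totals below are derived, not measured.
unroll_penalty = 0.5    # -funroll-loops vs. without it
profile_penalty = 0.1   # -fprofile-{generate,use} on top of unrolling

gcc_41_unroll = gcc_41_base + unroll_penalty
gcc_41_profile = gcc_41_unroll + profile_penalty

savings_pct = (gcc_401_base - gcc_41_base) / gcc_401_base * 100.0

print(f"4.1 base vs 4.0.1 base: {savings_pct:.1f}% less user time")
print(f"4.1 + unroll:           {gcc_41_unroll:.1f}s")
print(f"4.1 + unroll + profile: {gcc_41_profile:.1f}s")
```

With these rounded inputs the saving works out to a little under 16%, in line
with the roughly 15% quoted, and even with both penalties the derived 4.1
total stays below the 4.0.1 baseline.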

This was all with: -O3 -march=athlon-xp -fomit-frame-pointer (forgot -static,
but in a quick test it didn't seem to help significantly)

So perhaps this isn't a big deal now after all, though I'd like to think that
the PO, a feature to enhance performance, wouldn't hurt code performance *at
all*, either directly or indirectly (by requiring -funroll-loops).



* [Bug rtl-optimization/21527] BYTEmark bitmap test: Regression with Profiled Optimization
  2005-05-12  7:37 [Bug rtl-optimization/21527] New: BYTEmark bitmap test: Regression with Profiled Optimization jbucata at tulsaconnect dot com
                   ` (4 preceding siblings ...)
  2005-07-18 21:38 ` jbucata at tulsaconnect dot com
@ 2005-07-19 12:10 ` steven at gcc dot gnu dot org
  5 siblings, 0 replies; 7+ messages in thread
From: steven at gcc dot gnu dot org @ 2005-07-19 12:10 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From steven at gcc dot gnu dot org  2005-07-19 11:26 -------
Thanks for your feedback. 
 
I agree that profile opts should never hurt the code quality.  I am not sure
what the problem is in this particular case; I haven't looked at the actual
assembler output yet, or at how the compiler's decisions change when it has
the profile available to it.
 
Time to parse .s files then... 

-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|WAITING                     |NEW
   Last reconfirmed|2005-06-25 13:18:52         |2005-07-19 11:26:24
               date|                            |

