public inbox for gcc-bugs@sourceware.org
* [Bug rtl-optimization/21485] New: BYTEmark numsort: performance regression 3.4.3 -> 4.0.0 with -O3 optimization
@ 2005-05-10  9:03 jbucata at tulsaconnect dot com
  2005-05-10  9:05 ` [Bug rtl-optimization/21485] " jbucata at tulsaconnect dot com
                   ` (12 more replies)
  0 siblings, 13 replies; 16+ messages in thread
From: jbucata at tulsaconnect dot com @ 2005-05-10  9:03 UTC (permalink / raw)
  To: gcc-bugs

I've found a major performance regression in gcc 4.0.0's optimization of the
BYTEmark numsort benchmark.  I've boiled it down to a testcase that prints a
single number: the number of iterations run (higher is better).  On my machine
I get roughly 900 under 4.0.0 and around 1530 under 3.4.3.

Both were compiled and run in a Gentoo test partition, if that makes a difference:
3.4.3: gcc version 3.4.3-20050110 (Gentoo Linux 3.4.3.20050110-r2,
ssp-3.4.3.20050110-0, pie-8.7.7)
4.0.0: gcc version 4.0.0 (Gentoo Linux 4.0.0)

-- 
           Summary: BYTEmark numsort: performance regression 3.4.3 -> 4.0.0
                    with -O3 optimization
           Product: gcc
           Version: 4.0.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P2
         Component: rtl-optimization
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: jbucata at tulsaconnect dot com
                CC: gcc-bugs at gcc dot gnu dot org
GCC target triplet: i686-pc-linux-gnu


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21485



* [Bug rtl-optimization/21485] BYTEmark numsort: performance regression 3.4.3 -> 4.0.0 with -O3 optimization
  2005-05-10  9:03 [Bug rtl-optimization/21485] New: BYTEmark numsort: performance regression 3.4.3 -> 4.0.0 with -O3 optimization jbucata at tulsaconnect dot com
@ 2005-05-10  9:05 ` jbucata at tulsaconnect dot com
  2005-05-10  9:10 ` jbucata at tulsaconnect dot com
                   ` (11 subsequent siblings)
  12 siblings, 0 replies; 16+ messages in thread
From: jbucata at tulsaconnect dot com @ 2005-05-10  9:05 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From jbucata at tulsaconnect dot com  2005-05-10 09:05 -------
Created an attachment (id=8851)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=8851&action=view)
Test case (preprocessed)



* [Bug rtl-optimization/21485] BYTEmark numsort: performance regression 3.4.3 -> 4.0.0 with -O3 optimization
  2005-05-10  9:03 [Bug rtl-optimization/21485] New: BYTEmark numsort: performance regression 3.4.3 -> 4.0.0 with -O3 optimization jbucata at tulsaconnect dot com
  2005-05-10  9:05 ` [Bug rtl-optimization/21485] " jbucata at tulsaconnect dot com
@ 2005-05-10  9:10 ` jbucata at tulsaconnect dot com
  2005-05-10  9:15 ` steven at gcc dot gnu dot org
                   ` (10 subsequent siblings)
  12 siblings, 0 replies; 16+ messages in thread
From: jbucata at tulsaconnect dot com @ 2005-05-10  9:10 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From jbucata at tulsaconnect dot com  2005-05-10 09:10 -------
Oops, I should add that my pertinent options were: -O3 -fomit-frame-pointer
-march=athlon-xp -static



* [Bug rtl-optimization/21485] BYTEmark numsort: performance regression 3.4.3 -> 4.0.0 with -O3 optimization
  2005-05-10  9:03 [Bug rtl-optimization/21485] New: BYTEmark numsort: performance regression 3.4.3 -> 4.0.0 with -O3 optimization jbucata at tulsaconnect dot com
  2005-05-10  9:05 ` [Bug rtl-optimization/21485] " jbucata at tulsaconnect dot com
  2005-05-10  9:10 ` jbucata at tulsaconnect dot com
@ 2005-05-10  9:15 ` steven at gcc dot gnu dot org
  2005-05-10  9:27 ` rguenth at gcc dot gnu dot org
                   ` (9 subsequent siblings)
  12 siblings, 0 replies; 16+ messages in thread
From: steven at gcc dot gnu dot org @ 2005-05-10  9:15 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From steven at gcc dot gnu dot org  2005-05-10 09:14 -------
Confirmed on x86 (with and without frame pointer) and on amd64. 

-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
     Ever Confirmed|                            |1
   Last reconfirmed|0000-00-00 00:00:00         |2005-05-10 09:14:51
               date|                            |



* [Bug rtl-optimization/21485] BYTEmark numsort: performance regression 3.4.3 -> 4.0.0 with -O3 optimization
  2005-05-10  9:03 [Bug rtl-optimization/21485] New: BYTEmark numsort: performance regression 3.4.3 -> 4.0.0 with -O3 optimization jbucata at tulsaconnect dot com
                   ` (2 preceding siblings ...)
  2005-05-10  9:15 ` steven at gcc dot gnu dot org
@ 2005-05-10  9:27 ` rguenth at gcc dot gnu dot org
  2005-05-10  9:45 ` giovannibajo at libero dot it
                   ` (8 subsequent siblings)
  12 siblings, 0 replies; 16+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2005-05-10  9:27 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From rguenth at gcc dot gnu dot org  2005-05-10 09:27 -------
Mainline drops even lower.  It looks like a poor choice of addressing modes,
and thus more register pressure, for 4.0 and 4.1.  Note that using profile
feedback improves the numbers a lot, but we still regress.


* [Bug rtl-optimization/21485] BYTEmark numsort: performance regression 3.4.3 -> 4.0.0 with -O3 optimization
  2005-05-10  9:03 [Bug rtl-optimization/21485] New: BYTEmark numsort: performance regression 3.4.3 -> 4.0.0 with -O3 optimization jbucata at tulsaconnect dot com
                   ` (3 preceding siblings ...)
  2005-05-10  9:27 ` rguenth at gcc dot gnu dot org
@ 2005-05-10  9:45 ` giovannibajo at libero dot it
  2005-05-10  9:46 ` steven at gcc dot gnu dot org
                   ` (7 subsequent siblings)
  12 siblings, 0 replies; 16+ messages in thread
From: giovannibajo at libero dot it @ 2005-05-10  9:45 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From giovannibajo at libero dot it  2005-05-10 09:45 -------
Jason: thanks for this! Even better would be to have the testcase do a fixed
number of iterations (1000 or so), so that we can use "time" externally to
measure performance. Maybe you can do this for the other testcases you are
going to submit, thanks!


* [Bug rtl-optimization/21485] BYTEmark numsort: performance regression 3.4.3 -> 4.0.0 with -O3 optimization
  2005-05-10  9:03 [Bug rtl-optimization/21485] New: BYTEmark numsort: performance regression 3.4.3 -> 4.0.0 with -O3 optimization jbucata at tulsaconnect dot com
                   ` (4 preceding siblings ...)
  2005-05-10  9:45 ` giovannibajo at libero dot it
@ 2005-05-10  9:46 ` steven at gcc dot gnu dot org
  2005-05-10  9:51 ` steven at gcc dot gnu dot org
                   ` (6 subsequent siblings)
  12 siblings, 0 replies; 16+ messages in thread
From: steven at gcc dot gnu dot org @ 2005-05-10  9:46 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From steven at gcc dot gnu dot org  2005-05-10 09:46 -------
This is the function (reindented) where we spend almost all of our time: 
 
void 
NumSift (long *array, unsigned long i, unsigned long j) 
{ 
  unsigned long k; 
  long temp; 
  while ((i + i) <= j) 
    { 
      k = i + i; 
      if (k < j) 
        if (array[k] < array[k + 1L]) 
          ++k; 
      if (array[i] < array[k]) 
        { 
          temp = array[k]; 
          array[k] = array[i]; 
          array[i] = temp; 
          i = k; 
        } 
      else 
        i = j + 1; 
    } 
  return; 
} 
 


* [Bug rtl-optimization/21485] BYTEmark numsort: performance regression 3.4.3 -> 4.0.0 with -O3 optimization
  2005-05-10  9:03 [Bug rtl-optimization/21485] New: BYTEmark numsort: performance regression 3.4.3 -> 4.0.0 with -O3 optimization jbucata at tulsaconnect dot com
                   ` (5 preceding siblings ...)
  2005-05-10  9:46 ` steven at gcc dot gnu dot org
@ 2005-05-10  9:51 ` steven at gcc dot gnu dot org
  2005-05-10 10:22 ` steven at gcc dot gnu dot org
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 16+ messages in thread
From: steven at gcc dot gnu dot org @ 2005-05-10  9:51 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From steven at gcc dot gnu dot org  2005-05-10 09:50 -------
If Richard is right in comment #4, it would be interesting to see what 
happens if one tries this with Zdenek's TARGET_MEM_REF patch. 


* [Bug rtl-optimization/21485] BYTEmark numsort: performance regression 3.4.3 -> 4.0.0 with -O3 optimization
  2005-05-10  9:03 [Bug rtl-optimization/21485] New: BYTEmark numsort: performance regression 3.4.3 -> 4.0.0 with -O3 optimization jbucata at tulsaconnect dot com
                   ` (6 preceding siblings ...)
  2005-05-10  9:51 ` steven at gcc dot gnu dot org
@ 2005-05-10 10:22 ` steven at gcc dot gnu dot org
  2005-05-10 12:58 ` bonzini at gcc dot gnu dot org
                   ` (4 subsequent siblings)
  12 siblings, 0 replies; 16+ messages in thread
From: steven at gcc dot gnu dot org @ 2005-05-10 10:22 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From steven at gcc dot gnu dot org  2005-05-10 10:22 -------
On AMD64 with GCC 4.0.1 (CVS 4.0 branch) I go from ~580 at -O3 
to ~930 at -O3 -fno-tree-pre. 


* [Bug rtl-optimization/21485] BYTEmark numsort: performance regression 3.4.3 -> 4.0.0 with -O3 optimization
  2005-05-10  9:03 [Bug rtl-optimization/21485] New: BYTEmark numsort: performance regression 3.4.3 -> 4.0.0 with -O3 optimization jbucata at tulsaconnect dot com
                   ` (7 preceding siblings ...)
  2005-05-10 10:22 ` steven at gcc dot gnu dot org
@ 2005-05-10 12:58 ` bonzini at gcc dot gnu dot org
  2005-05-10 16:38 ` [Bug rtl-optimization/21485] [4.0/4.1 Regression] BYTEmark numsort: codegen regression with -O3 giovannibajo at libero dot it
                   ` (3 subsequent siblings)
  12 siblings, 0 replies; 16+ messages in thread
From: bonzini at gcc dot gnu dot org @ 2005-05-10 12:58 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From bonzini at gcc dot gnu dot org  2005-05-10 12:58 -------
Looks like a register pressure problem... but yes, TARGET_MEM_REF may help as well.

Paolo


* [Bug rtl-optimization/21485] [4.0/4.1 Regression] BYTEmark numsort: codegen regression with -O3
  2005-05-10  9:03 [Bug rtl-optimization/21485] New: BYTEmark numsort: performance regression 3.4.3 -> 4.0.0 with -O3 optimization jbucata at tulsaconnect dot com
                   ` (8 preceding siblings ...)
  2005-05-10 12:58 ` bonzini at gcc dot gnu dot org
@ 2005-05-10 16:38 ` giovannibajo at libero dot it
  2005-05-10 20:03 ` pinskia at gcc dot gnu dot org
                   ` (2 subsequent siblings)
  12 siblings, 0 replies; 16+ messages in thread
From: giovannibajo at libero dot it @ 2005-05-10 16:38 UTC (permalink / raw)
  To: gcc-bugs



-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
      Known to fail|                            |4.0.0 4.1.0
      Known to work|                            |3.4.3
            Summary|BYTEmark numsort:           |[4.0/4.1 Regression]
                   |performance regression 3.4.3|BYTEmark numsort: codegen
                   |-> 4.0.0 with -O3           |regression with -O3
                   |optimization                |
   Target Milestone|---                         |4.0.1



* [Bug rtl-optimization/21485] [4.0/4.1 Regression] BYTEmark numsort: codegen regression with -O3
  2005-05-10  9:03 [Bug rtl-optimization/21485] New: BYTEmark numsort: performance regression 3.4.3 -> 4.0.0 with -O3 optimization jbucata at tulsaconnect dot com
                   ` (9 preceding siblings ...)
  2005-05-10 16:38 ` [Bug rtl-optimization/21485] [4.0/4.1 Regression] BYTEmark numsort: codegen regression with -O3 giovannibajo at libero dot it
@ 2005-05-10 20:03 ` pinskia at gcc dot gnu dot org
  2005-07-08  1:41 ` mmitchel at gcc dot gnu dot org
  2005-09-27 16:21 ` mmitchel at gcc dot gnu dot org
  12 siblings, 0 replies; 16+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2005-05-10 20:03 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From pinskia at gcc dot gnu dot org  2005-05-10 20:03 -------
IV-OPTS does nothing to this testcase; it does not even change the trees.  This is just an RA (register allocation) issue.

-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |ra



* [Bug rtl-optimization/21485] [4.0/4.1 Regression] BYTEmark numsort: codegen regression with -O3
  2005-05-10  9:03 [Bug rtl-optimization/21485] New: BYTEmark numsort: performance regression 3.4.3 -> 4.0.0 with -O3 optimization jbucata at tulsaconnect dot com
                   ` (10 preceding siblings ...)
  2005-05-10 20:03 ` pinskia at gcc dot gnu dot org
@ 2005-07-08  1:41 ` mmitchel at gcc dot gnu dot org
  2005-09-27 16:21 ` mmitchel at gcc dot gnu dot org
  12 siblings, 0 replies; 16+ messages in thread
From: mmitchel at gcc dot gnu dot org @ 2005-07-08  1:41 UTC (permalink / raw)
  To: gcc-bugs



-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|4.0.1                       |4.0.2



* [Bug rtl-optimization/21485] [4.0/4.1 Regression] BYTEmark numsort: codegen regression with -O3
  2005-05-10  9:03 [Bug rtl-optimization/21485] New: BYTEmark numsort: performance regression 3.4.3 -> 4.0.0 with -O3 optimization jbucata at tulsaconnect dot com
                   ` (11 preceding siblings ...)
  2005-07-08  1:41 ` mmitchel at gcc dot gnu dot org
@ 2005-09-27 16:21 ` mmitchel at gcc dot gnu dot org
  12 siblings, 0 replies; 16+ messages in thread
From: mmitchel at gcc dot gnu dot org @ 2005-09-27 16:21 UTC (permalink / raw)
  To: gcc-bugs



-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|4.0.2                       |4.0.3



* [Bug rtl-optimization/21485] [4.0/4.1 Regression] BYTEmark numsort: codegen regression with -O3
       [not found] <bug-21485-10607@http.gcc.gnu.org/bugzilla/>
  2005-10-31  3:21 ` mmitchel at gcc dot gnu dot org
@ 2005-10-31  3:37 ` pinskia at gcc dot gnu dot org
  1 sibling, 0 replies; 16+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2005-10-31  3:37 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #12 from pinskia at gcc dot gnu dot org  2005-10-31 03:37 -------
(In reply to comment #11)
> So, we're doing a worse job on register allocation.  Is that because the
> register allocator got worse, or because we're giving it a harder problem to
> solve?  If the latter, what's responsible, and is there anything we can do
> about it?
It is the latter, and the pass responsible is tree PRE.  But if someone writes
code by hand in the shape that tree PRE produces, we will still have the same
issue, so the only correct place to fix this is in RA.



* [Bug rtl-optimization/21485] [4.0/4.1 Regression] BYTEmark numsort: codegen regression with -O3
       [not found] <bug-21485-10607@http.gcc.gnu.org/bugzilla/>
@ 2005-10-31  3:21 ` mmitchel at gcc dot gnu dot org
  2005-10-31  3:37 ` pinskia at gcc dot gnu dot org
  1 sibling, 0 replies; 16+ messages in thread
From: mmitchel at gcc dot gnu dot org @ 2005-10-31  3:21 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #11 from mmitchel at gcc dot gnu dot org  2005-10-31 03:21 -------
We need more analysis on these kinds of issues.

So, we're doing a worse job on register allocation.  Is that because the
register allocator got worse, or because we're giving it a harder problem to
solve?  If the latter, what's responsible, and is there anything we can do
about it?  We need that kind of information to make an intelligent decision
about whether or not we should try to fix this, or just chalk it up to the fact
that there's always variability between releases.


