[Bug target/23322] [4.1 regression] performance regression, possibly related to caching

public inbox for gcc-bugs@sourceware.org

* [Bug target/23322] [4.1 regression] performance regression, possibly related to caching
       [not found] <bug-23322-10914@http.gcc.gnu.org/bugzilla/>
@ 2005-10-31  4:49 ` mmitchel at gcc dot gnu dot org
  2006-02-24  0:30 ` [Bug target/23322] [4.1/4.2 " mmitchel at gcc dot gnu dot org
                   ` (33 subsequent siblings)
  34 siblings, 0 replies; 35+ messages in thread
From: mmitchel at gcc dot gnu dot org @ 2005-10-31  4:49 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #5 from mmitchel at gcc dot gnu dot org  2005-10-31 04:49 -------
Do we have any analysis about why the register allocator is doing a worse job?

Leaving as P2.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23322



* [Bug target/23322] [4.1/4.2 regression] performance regression, possibly related to caching
       [not found] <bug-23322-10914@http.gcc.gnu.org/bugzilla/>
  2005-10-31  4:49 ` [Bug target/23322] [4.1 regression] performance regression, possibly related to caching mmitchel at gcc dot gnu dot org
@ 2006-02-24  0:30 ` mmitchel at gcc dot gnu dot org
  2006-05-25  2:34 ` mmitchel at gcc dot gnu dot org
                   ` (32 subsequent siblings)
  34 siblings, 0 replies; 35+ messages in thread
From: mmitchel at gcc dot gnu dot org @ 2006-02-24  0:30 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #6 from mmitchel at gcc dot gnu dot org  2006-02-24 00:26 -------
This issue will not be resolved in GCC 4.1.0; retargeted at GCC 4.1.1.


-- 

mmitchel at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|4.1.0                       |4.1.1


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23322



* [Bug target/23322] [4.1/4.2 regression] performance regression, possibly related to caching
       [not found] <bug-23322-10914@http.gcc.gnu.org/bugzilla/>
  2005-10-31  4:49 ` [Bug target/23322] [4.1 regression] performance regression, possibly related to caching mmitchel at gcc dot gnu dot org
  2006-02-24  0:30 ` [Bug target/23322] [4.1/4.2 " mmitchel at gcc dot gnu dot org
@ 2006-05-25  2:34 ` mmitchel at gcc dot gnu dot org
  2006-08-27 18:46 ` pinskia at gcc dot gnu dot org
                   ` (31 subsequent siblings)
  34 siblings, 0 replies; 35+ messages in thread
From: mmitchel at gcc dot gnu dot org @ 2006-05-25  2:34 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #7 from mmitchel at gcc dot gnu dot org  2006-05-25 02:33 -------
Will not be fixed in 4.1.1; adjust target milestone to 4.1.2.


-- 

mmitchel at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|4.1.1                       |4.1.2


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23322



* [Bug target/23322] [4.1/4.2 regression] performance regression, possibly related to caching
       [not found] <bug-23322-10914@http.gcc.gnu.org/bugzilla/>
                   ` (2 preceding siblings ...)
  2006-05-25  2:34 ` mmitchel at gcc dot gnu dot org
@ 2006-08-27 18:46 ` pinskia at gcc dot gnu dot org
  2007-02-14  9:06 ` [Bug target/23322] [4.1/4.2/4.3 " mmitchel at gcc dot gnu dot org
                   ` (30 subsequent siblings)
  34 siblings, 0 replies; 35+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2006-08-27 18:46 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #8 from pinskia at gcc dot gnu dot org  2006-08-27 18:45 -------
On the mainline we now get:
.L66:
        fldl    -40(%ebp)
        faddl   (%esi,%eax,8)
        addl    $1, %eax
        cmpl    %edx, %eax
        fstpl   -40(%ebp)
        jne     .L66


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23322



* [Bug target/23322] [4.1/4.2/4.3 regression] performance regression, possibly related to caching
       [not found] <bug-23322-10914@http.gcc.gnu.org/bugzilla/>
                   ` (3 preceding siblings ...)
  2006-08-27 18:46 ` pinskia at gcc dot gnu dot org
@ 2007-02-14  9:06 ` mmitchel at gcc dot gnu dot org
  2007-12-13 14:10 ` ubizjak at gmail dot com
                   ` (29 subsequent siblings)
  34 siblings, 0 replies; 35+ messages in thread
From: mmitchel at gcc dot gnu dot org @ 2007-02-14  9:06 UTC (permalink / raw)
  To: gcc-bugs



-- 

mmitchel at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|4.1.2                       |4.1.3


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23322



* [Bug target/23322] [4.1/4.2/4.3 regression] performance regression, possibly related to caching
       [not found] <bug-23322-10914@http.gcc.gnu.org/bugzilla/>
                   ` (4 preceding siblings ...)
  2007-02-14  9:06 ` [Bug target/23322] [4.1/4.2/4.3 " mmitchel at gcc dot gnu dot org
@ 2007-12-13 14:10 ` ubizjak at gmail dot com
  2007-12-13 14:13 ` ubizjak at gmail dot com
                   ` (28 subsequent siblings)
  34 siblings, 0 replies; 35+ messages in thread
From: ubizjak at gmail dot com @ 2007-12-13 14:10 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #9 from ubizjak at gmail dot com  2007-12-13 14:10 -------
Reduced C++ testcase that shows the cause of the runtime difference:

--cut here--
#include <iostream>

extern double *dpb;

void s000005m_test(void)
{
  double result = 0.0;

  for (int n = 0; n < 2000; ++n)
    result += dpb[n];

#ifdef FUBAR
    std::cerr << "Blah" << result << std::endl;
#else
    std::cerr << result << std::endl;
#endif
}
--cut here--

g++ -O2:

        ...
.LCFI8:
        movl    dpb, %edx       # dpb, dpb.68
        fldz
.L4:
        faddl   (%edx,%eax,8)   #* dpb.68
        addl    $1, %eax        #, n
        cmpl    $2000, %eax     #, n
        jne     .L4     #,
        fstpl   4(%esp) #
        movl    $_ZSt4cerr, (%esp)      #,
        call    _ZNSo9_M_insertIdEERSoT_        #
        ...

g++ -O2 -DFUBAR:

        ...
.LCFI8:
        movl    dpb, %edx       # dpb, dpb.68
        fldz
        fstpl   -288(%ebp)      # result
        .p2align 4,,7
        .p2align 3
.L4:
        fldl    -288(%ebp)      # result
        faddl   (%edx,%eax,8)   #* dpb.68
        addl    $1, %eax        #, n
        cmpl    $2000, %eax     #, n
        fstpl   -288(%ebp)      # result
        jne     .L4     #,
        movl    $4, 8(%esp)     #,
        movl    $.LC1, 4(%esp)  #,
        movl    $_ZSt4cerr, (%esp)      #,
        call    _ZSt16__ostream_insertIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_PKS3_i  #
        movl    $_ZSt4cerr, (%esp)      #,
        fldl    -288(%ebp)      # result
        fstpl   4(%esp) #
        call    _ZNSo9_M_insertIdEERSoT_        #
        ...

Note what happens to the "result" variable in the -DFUBAR case: it is loaded
from and stored back to its stack slot on every loop iteration.

A similar effect happens with -mfpmath=sse, but postreload gcse eliminates the
load (but not the store) from the loop (stack regs are not gcse'd after reload
by design). IMO, this is not target dependent, but purely an RA problem.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23322



* [Bug target/23322] [4.1/4.2/4.3 regression] performance regression, possibly related to caching
       [not found] <bug-23322-10914@http.gcc.gnu.org/bugzilla/>
                   ` (5 preceding siblings ...)
  2007-12-13 14:10 ` ubizjak at gmail dot com
@ 2007-12-13 14:13 ` ubizjak at gmail dot com
  2007-12-13 14:24 ` ubizjak at gmail dot com
                   ` (27 subsequent siblings)
  34 siblings, 0 replies; 35+ messages in thread
From: ubizjak at gmail dot com @ 2007-12-13 14:13 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #10 from ubizjak at gmail dot com  2007-12-13 14:12 -------
BTW: the .p2align directives were removed manually from the first case for
clarity; I just forgot to remove them from the second case before posting.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23322



* [Bug target/23322] [4.1/4.2/4.3 regression] performance regression, possibly related to caching
       [not found] <bug-23322-10914@http.gcc.gnu.org/bugzilla/>
                   ` (6 preceding siblings ...)
  2007-12-13 14:13 ` ubizjak at gmail dot com
@ 2007-12-13 14:24 ` ubizjak at gmail dot com
  2007-12-13 14:36 ` rguenth at gcc dot gnu dot org
                   ` (26 subsequent siblings)
  34 siblings, 0 replies; 35+ messages in thread
From: ubizjak at gmail dot com @ 2007-12-13 14:24 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #11 from ubizjak at gmail dot com  2007-12-13 14:24 -------
C testcase:

--cut here--
extern void foo(void);

extern double *dpb;

double s000005m_test(void)
{
  double result = 0.0;
  int n;

  for (n = 0; n < 2000; ++n)
    result += dpb[n];

#ifdef FOOBAR
  foo();
#endif
  return result;
}
--cut here--

This also shows the problem on x86_64, so it is probably not target dependent:

.LCFI0:
        movq    dpb(%rip), %rdx
        xorl    %eax, %eax
        movsd   %xmm0, (%rsp)
        .align 16
.L2:
        movsd   (%rsp), %xmm0
        addsd   (%rdx,%rax,8), %xmm0
        addq    $1, %rax
        cmpq    $2000, %rax
        movsd   %xmm0, (%rsp)
        jne     .L2
        call    foo
        movsd   (%rsp), %xmm0
        addq    $8, %rsp
        ret


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23322



* [Bug target/23322] [4.1/4.2/4.3 regression] performance regression, possibly related to caching
       [not found] <bug-23322-10914@http.gcc.gnu.org/bugzilla/>
                   ` (7 preceding siblings ...)
  2007-12-13 14:24 ` ubizjak at gmail dot com
@ 2007-12-13 14:36 ` rguenth at gcc dot gnu dot org
  2007-12-13 14:43 ` rguenth at gcc dot gnu dot org
                   ` (25 subsequent siblings)
  34 siblings, 0 replies; 35+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2007-12-13 14:36 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #12 from rguenth at gcc dot gnu dot org  2007-12-13 14:36 -------
This is still a register allocation problem.  We somehow prefer xmm0 which is
call clobbered and causes reloads inside the loop.

Micha? :)


-- 

rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |matz at gcc dot gnu dot org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23322



* [Bug target/23322] [4.1/4.2/4.3 regression] performance regression, possibly related to caching
       [not found] <bug-23322-10914@http.gcc.gnu.org/bugzilla/>
                   ` (8 preceding siblings ...)
  2007-12-13 14:36 ` rguenth at gcc dot gnu dot org
@ 2007-12-13 14:43 ` rguenth at gcc dot gnu dot org
  2007-12-13 14:54 ` rguenth at gcc dot gnu dot org
                   ` (24 subsequent siblings)
  34 siblings, 0 replies; 35+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2007-12-13 14:43 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #13 from rguenth at gcc dot gnu dot org  2007-12-13 14:43 -------
I guess if we split the live range of (reg:DF 64 [result]) so that it does not
extend over the call, global wouldn't reload all of its uses.
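
At source level, the effect of such a split can be pictured roughly as below.
This is only a sketch under assumptions: the storage for dpb, the counting
foo() and the function name s000005m_test_split are hypothetical stand-ins so
that the fragment links, and the compiler is free to coalesce the two variables
again, so it illustrates the idea rather than offering a workaround.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the testcase's externals, so the sketch links. */
static double dpb_storage[2000];
double *dpb = dpb_storage;
static int foo_calls;
void foo(void) { ++foo_calls; }

double s000005m_test_split(void)
{
  double acc = 0.0;   /* live only inside the loop, never across a call */
  double result;
  int n;

  for (n = 0; n < 2000; ++n)
    acc += dpb[n];

  result = acc;       /* the only value that is live across foo() */
  foo();
  return result;
}
```

With the range split this way, only "result" would need a stack slot or a save
around the call, while "acc" could stay in a call-clobbered register for the
whole loop.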


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23322



* [Bug target/23322] [4.1/4.2/4.3 regression] performance regression, possibly related to caching
       [not found] <bug-23322-10914@http.gcc.gnu.org/bugzilla/>
                   ` (9 preceding siblings ...)
  2007-12-13 14:43 ` rguenth at gcc dot gnu dot org
@ 2007-12-13 14:54 ` rguenth at gcc dot gnu dot org
  2007-12-13 15:00 ` rguenth at gcc dot gnu dot org
                   ` (23 subsequent siblings)
  34 siblings, 0 replies; 35+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2007-12-13 14:54 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #14 from rguenth at gcc dot gnu dot org  2007-12-13 14:54 -------
Does yara address this somehow?


-- 

rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |vmakarov at redhat dot com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23322



* [Bug target/23322] [4.1/4.2/4.3 regression] performance regression, possibly related to caching
       [not found] <bug-23322-10914@http.gcc.gnu.org/bugzilla/>
                   ` (10 preceding siblings ...)
  2007-12-13 14:54 ` rguenth at gcc dot gnu dot org
@ 2007-12-13 15:00 ` rguenth at gcc dot gnu dot org
  2007-12-13 15:42 ` [Bug target/23322] [4.1/4.2/4.3 regression] performance regression: global regalloc doesn't split live ranges ubizjak at gmail dot com
                   ` (22 subsequent siblings)
  34 siblings, 0 replies; 35+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2007-12-13 15:00 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #15 from rguenth at gcc dot gnu dot org  2007-12-13 15:00 -------
"Works" with 2.95.4, fails at least starting with 3.3.6 (-m32).  Also happens
on x86_64, but there it's not a regression.  Happens on all targets that have
only call-clobbered registers that can hold 'result'.


-- 

rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
 GCC target triplet|i686-*-*                    |i686-*-*, x86_64-*-*
      Known to fail|                            |3.3.6
      Known to work|                            |2.95.4


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23322



* [Bug target/23322] [4.1/4.2/4.3 regression] performance regression: global regalloc doesn't split live ranges
       [not found] <bug-23322-10914@http.gcc.gnu.org/bugzilla/>
                   ` (11 preceding siblings ...)
  2007-12-13 15:00 ` rguenth at gcc dot gnu dot org
@ 2007-12-13 15:42 ` ubizjak at gmail dot com
  2008-02-05 16:19 ` hubicka at gcc dot gnu dot org
                   ` (21 subsequent siblings)
  34 siblings, 0 replies; 35+ messages in thread
From: ubizjak at gmail dot com @ 2007-12-13 15:42 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #16 from ubizjak at gmail dot com  2007-12-13 15:41 -------
Just for fun, I have marked xmm12 - xmm15 as call-used for x86_64 and the
allocator did the semi-right thing by using xmm12 for 'result'.


-- 

ubizjak at gmail dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|[4.1/4.2/4.3 regression]    |[4.1/4.2/4.3 regression]
                   |performance regression,     |performance regression:
                   |possibly related to caching |global regalloc doesn't
                   |                            |split live ranges


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23322



* [Bug target/23322] [4.1/4.2/4.3 regression] performance regression: global regalloc doesn't split live ranges
       [not found] <bug-23322-10914@http.gcc.gnu.org/bugzilla/>
                   ` (12 preceding siblings ...)
  2007-12-13 15:42 ` [Bug target/23322] [4.1/4.2/4.3 regression] performance regression: global regalloc doesn't split live ranges ubizjak at gmail dot com
@ 2008-02-05 16:19 ` hubicka at gcc dot gnu dot org
  2008-02-05 16:25 ` [Bug target/23322] [4.1/4.2/4.3 regression] performance regression hubicka at gcc dot gnu dot org
                   ` (20 subsequent siblings)
  34 siblings, 0 replies; 35+ messages in thread
From: hubicka at gcc dot gnu dot org @ 2008-02-05 16:19 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #17 from hubicka at gcc dot gnu dot org  2008-02-05 16:18 -------
The simplified testcase is dealt with by the call-crossed frequency patch.  I
now get:
.L2:
        faddl   (%edx,%eax,8)
        addl    $1, %eax
        cmpl    $2000, %eax
        jne     .L2
        fstpl   -24(%ebp)
        call    foo
        fldl    -24(%ebp)
        leave
        ret

With full testcase:
hubicka@occam:/aux/hubicka/trunk-write/buidl2/gcc$
/aux/hubicka/gcc-install/bin/g++ -O2 tt.cc --static
hubicka@occam:/aux/hubicka/trunk-write/buidl2/gcc$ time ./a.out

real    0m4.102s
user    0m4.092s
sys     0m0.008s

hubicka@occam:/aux/hubicka/trunk-write/buidl2/gcc$ g++-3.3 -O2 tt.cc --static
hubicka@occam:/aux/hubicka/trunk-write/buidl2/gcc$ time ./a.out

real    0m3.714s
user    0m3.708s
sys     0m0.000s

I don't have 2.95.3, but we probably need to analyze what happened relative
to 3.3 and 3.4 too.


-- 

hubicka at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|2005-08-11 14:56:31         |2008-02-05 16:18:23
               date|                            |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23322



* [Bug target/23322] [4.1/4.2/4.3 regression] performance regression
       [not found] <bug-23322-10914@http.gcc.gnu.org/bugzilla/>
                   ` (13 preceding siblings ...)
  2008-02-05 16:19 ` hubicka at gcc dot gnu dot org
@ 2008-02-05 16:25 ` hubicka at gcc dot gnu dot org
  2008-02-05 18:26 ` ubizjak at gmail dot com
                   ` (19 subsequent siblings)
  34 siblings, 0 replies; 35+ messages in thread
From: hubicka at gcc dot gnu dot org @ 2008-02-05 16:25 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #18 from hubicka at gcc dot gnu dot org  2008-02-05 16:24 -------
RA still doesn't split live ranges, but it works sanely here:
.L21:   
        faddl   (%ebx,%eax,8)
        addl    $1, %eax
        cmpl    %edx, %eax
        jl      .L21
Honza


-- 

hubicka at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|[4.1/4.2/4.3 regression]    |[4.1/4.2/4.3 regression]
                   |performance regression:     |performance regression
                   |global regalloc doesn't     |
                   |split live ranges           |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23322



* [Bug target/23322] [4.1/4.2/4.3 regression] performance regression
       [not found] <bug-23322-10914@http.gcc.gnu.org/bugzilla/>
                   ` (14 preceding siblings ...)
  2008-02-05 16:25 ` [Bug target/23322] [4.1/4.2/4.3 regression] performance regression hubicka at gcc dot gnu dot org
@ 2008-02-05 18:26 ` ubizjak at gmail dot com
  2008-02-05 22:51 ` hubicka at gcc dot gnu dot org
                   ` (18 subsequent siblings)
  34 siblings, 0 replies; 35+ messages in thread
From: ubizjak at gmail dot com @ 2008-02-05 18:26 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #19 from ubizjak at gmail dot com  2008-02-05 18:25 -------
There was a discussion on IRC some time ago, and it was suggested that there
was an LR-splitting patch in the Cygnus local tree. Maybe someone would like to
post this patch to the gcc-patches@ ML?


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23322



* [Bug target/23322] [4.1/4.2/4.3 regression] performance regression
       [not found] <bug-23322-10914@http.gcc.gnu.org/bugzilla/>
                   ` (15 preceding siblings ...)
  2008-02-05 18:26 ` ubizjak at gmail dot com
@ 2008-02-05 22:51 ` hubicka at gcc dot gnu dot org
  2008-02-05 23:54 ` hubicka at gcc dot gnu dot org
                   ` (17 subsequent siblings)
  34 siblings, 0 replies; 35+ messages in thread
From: hubicka at gcc dot gnu dot org @ 2008-02-05 22:51 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #20 from hubicka at gcc dot gnu dot org  2008-02-05 22:50 -------
The last rumors I heard about LR splitting were that it didn't really help or
work, and that it used LOOP notes, so it would need a complete rewrite anyway.
This problem wasn't really an LR-splitting issue, just a wrong caller-save
decision heuristic.
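
One way to picture a caller-save decision is as a comparison between two
frequency-weighted costs: saving and restoring a call-clobbered register around
each call it is live across, versus keeping the value in memory at every
reference. The sketch below is a toy model only; the struct, its fields and the
cost weights are hypothetical and are not taken from GCC.

```c
/* Hypothetical cost model; none of these names exist in GCC. */
struct pseudo_info {
  int ref_freq;            /* frequency-weighted number of uses and sets */
  int calls_crossed_freq;  /* frequency-weighted number of calls crossed */
};

/* Returns nonzero when a call-clobbered hard register, with a save/restore
   pair around each crossed call, is estimated cheaper than a stack slot.  */
int prefer_caller_save(const struct pseudo_info *p)
{
  int caller_save_cost = 2 * p->calls_crossed_freq; /* one store + one load */
  int memory_cost = p->ref_freq;                    /* one access per reference */
  return caller_save_cost < memory_cost;
}
```

For the reduced testcase, the value is referenced in a loop that runs 2000
times but is live across a single call, so a model of this kind would favour
the register plus one save/restore pair.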


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23322



* [Bug target/23322] [4.1/4.2/4.3 regression] performance regression
       [not found] <bug-23322-10914@http.gcc.gnu.org/bugzilla/>
                   ` (16 preceding siblings ...)
  2008-02-05 22:51 ` hubicka at gcc dot gnu dot org
@ 2008-02-05 23:54 ` hubicka at gcc dot gnu dot org
  2008-02-06  6:52 ` ubizjak at gmail dot com
                   ` (16 subsequent siblings)
  34 siblings, 0 replies; 35+ messages in thread
From: hubicka at gcc dot gnu dot org @ 2008-02-05 23:54 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #21 from hubicka at gcc dot gnu dot org  2008-02-05 23:54 -------
Looking at -O2 and -O2 -fno-inline-small-functions, I believe the last
remaining problem is our inability to hoist the constant load out of the loop.

Without inlining, the fill loop takes the value as an argument:
.L7:
        fstl    (%eax)
        addl    $8, %eax
        cmpl    %eax, %edx
        jne     .L7
        fstp    %st(0)

With inlining, however, we constant-propagate the value, which later results in
a load/store pair:
.L8:
        flds    .LC0
        fstpl   (%eax)
        addl    $8, %eax
        cmpl    %eax, %edx
        jne     .L8

This is done intentionally:

    /* Hoisting constant pool constants into stack regs may cost more than
       just single register.  On x87, the balance is affected both by the
       small number of FP registers, and by its register stack organization,
       that forces us to add compensation code in and around the loop to
       shuffle the operands to the top of stack before use, and pop them
       from the stack after the loop finishes.

       To model this effect, we increase the number of registers needed for
       stack registers by two: one register push, and one register pop.
       This usually has the effect that FP constant loads from the constant
       pool are not moved out of the loop.

       Note that this also means that dependent invariants can not be moved.
       However, the primary purpose of this pass is to move loop invariant
       address arithmetic out of loops, and address arithmetic that depends
       on floating point constants is unlikely to ever occur.  */
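
The effect described in that comment can be pictured with a toy model. The two
functions below are hypothetical and greatly simplified (the real pass weighs
gains against register pressure costs); they only show how adding two to the
register count can flip a hoisting decision.

```c
/* Hypothetical model of the heuristic; not the code in loop-invariant.c. */
int regs_needed_for_invariant(int is_stack_mode, int is_constant_pool_load)
{
  int regs_needed = 1;          /* the register holding the invariant */
  if (is_stack_mode && is_constant_pool_load)
    regs_needed += 2;           /* one register push, one register pop */
  return regs_needed;
}

/* Hoist only if the loop's registers plus the invariant's still fit. */
int should_hoist(int regs_used, int regs_needed, int regs_available)
{
  return regs_used + regs_needed <= regs_available;
}
```

With, say, four registers in use and six available, an ordinary invariant
(one register) would be hoisted, while a stack-mode constant pool load
(three registers in this model) would stay inside the loop.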

Obviously this heuristic misbehaves in such simple cases, where no other
registers are carried over the loop.  Another obvious problem is that it is in
effect for SSE codegen too.  I am testing the following patch, which solves the
second problem:

Index: loop-invariant.c
===================================================================
*** loop-invariant.c    (revision 131965)
--- loop-invariant.c    (working copy)
*************** get_inv_cost (struct invariant *inv, int
*** 1012,1017 ****
--- 1012,1018 ----
      rtx set = single_set (inv->insn);
      if (set
         && IS_STACK_MODE (GET_MODE (SET_SRC (set)))
+        && (!TARGET_SSE_MATH || !SSE_FLOAT_MODE_P (GET_MODE (SET_SRC (set))))
         && constant_pool_constant_p (SET_SRC (set)))
        (*regs_needed) += 2;
    }

This cures the problem for -mfpmath=sse at least.  On a 64-bit target this now
does a good job.  For 32-bit, however, we get another transformation:

.L8:
        movl    $0, (%eax)
        movl    $1074266112, 4(%eax)
        addl    $8, %eax
        cmpl    %eax, %edx
        jne     .L8
.L2:
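
The two immediates in this loop are the low and high 32-bit words of the
double being stored. A small check, assuming IEEE-754 binary64 doubles and the
little-endian layout used on x86 (split_double is a hypothetical helper name):

```c
#include <stdint.h>
#include <string.h>

/* Split a double into the two 32-bit words that a little-endian x86
   target stores at offsets 0 and 4.  */
void split_double(double d, uint32_t *low, uint32_t *high)
{
  uint64_t bits;
  memcpy(&bits, &d, sizeof bits);          /* well-defined type pun */
  *low = (uint32_t)(bits & 0xffffffffu);
  *high = (uint32_t)(bits >> 32);
}
```

For 3.0 this gives low = 0 and high = 1074266112 (0x40080000), matching the
two movl instructions. The double is therefore written as two 4-byte stores
instead of one 8-byte store.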

At least on Athlon this slows things down due to a partial memory stall.
This can be fixed by the following:

Index: config/i386/i386.md
===================================================================
*** config/i386/i386.md (revision 131965)
--- config/i386/i386.md (working copy)
***************
*** 2690,2704 ****
    [(set (match_operand:DF 0 "nonimmediate_operand"
                        "=f,m,f,*r  ,o  ,Y2*x,Y2*x,Y2*x ,m  ")
        (match_operand:DF 1 "general_operand"
!                       "fm,f,G,*roF,F*r,C   ,Y2*x,mY2*x,Y2*x"))]
    "!(MEM_P (operands[0]) && MEM_P (operands[1]))
     && ((optimize_size || !TARGET_INTEGER_DFMODE_MOVES) && !TARGET_64BIT)
     && (reload_in_progress || reload_completed
         || (ix86_cmodel == CM_MEDIUM || ix86_cmodel == CM_LARGE)
         || (!(TARGET_SSE2 && TARGET_SSE_MATH) && optimize_size
           && standard_80387_constant_p (operands[1]))
         || GET_CODE (operands[1]) != CONST_DOUBLE
!        || memory_operand (operands[0], DFmode))"
  {
    switch (which_alternative)
      {
--- 2690,2708 ----
    [(set (match_operand:DF 0 "nonimmediate_operand"
                        "=f,m,f,*r  ,o  ,Y2*x,Y2*x,Y2*x ,m  ")
        (match_operand:DF 1 "general_operand"
!                       "fm,f,G,*roF,*Fr,C   ,Y2*x,mY2*x,Y2*x"))]
    "!(MEM_P (operands[0]) && MEM_P (operands[1]))
     && ((optimize_size || !TARGET_INTEGER_DFMODE_MOVES) && !TARGET_64BIT)
     && (reload_in_progress || reload_completed
         || (ix86_cmodel == CM_MEDIUM || ix86_cmodel == CM_LARGE)
         || (!(TARGET_SSE2 && TARGET_SSE_MATH) && optimize_size
+            && !memory_operand (operands[0], DFmode)
           && standard_80387_constant_p (operands[1]))
         || GET_CODE (operands[1]) != CONST_DOUBLE
!        || ((optimize_size
!             || !TARGET_MEMORY_MISMATCH_STALL
!           || reload_in_progress || reload_completed)
!          && memory_operand (operands[0], DFmode)))"
  {
    switch (which_alternative)
      {

Now, with SSE codegen or with the STACK_REGS heuristic bit commented out, we get
a better score with -O2 than with -O2 -fno-inline-small-functions.  I guess the
heuristic can be made more selective; currently I think it just disables all
the hoists, which is wrong.

Honza


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23322



* [Bug target/23322] [4.1/4.2/4.3 regression] performance regression
       [not found] <bug-23322-10914@http.gcc.gnu.org/bugzilla/>
                   ` (17 preceding siblings ...)
  2008-02-05 23:54 ` hubicka at gcc dot gnu dot org
@ 2008-02-06  6:52 ` ubizjak at gmail dot com
  2008-02-06  9:01 ` steven at gcc dot gnu dot org
                   ` (15 subsequent siblings)
  34 siblings, 0 replies; 35+ messages in thread
From: ubizjak at gmail dot com @ 2008-02-06  6:52 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #22 from ubizjak at gmail dot com  2008-02-06 06:52 -------
(In reply to comment #21)

> Obviously this heuristic is misbehaving in such a simple cases where no other
> registers are carried over loop.  One obvious problem is also that it is in
> effect for SSE codegen too.  I am testing following patch that solves the
> second problem:
> 
> Index: loop-invariant.c
> ===================================================================
> *** loop-invariant.c    (revision 131965)
> --- loop-invariant.c    (working copy)
> *************** get_inv_cost (struct invariant *inv, int
> *** 1012,1017 ****
> --- 1012,1018 ----
>       rtx set = single_set (inv->insn);
>       if (set
>          && IS_STACK_MODE (GET_MODE (SET_SRC (set)))
> +        && (!TARGET_SSE_MATH || !SSE_FLOAT_MODE_P (GET_MODE (SET_SRC (set))))
>          && constant_pool_constant_p (SET_SRC (set)))
>         (*regs_needed) += 2;
>     }
> 
> and cure this problem for -mfpmath=SSE at least. On 64bit target this now does

But we _have_ following in i386.h:

#define IS_STACK_MODE(MODE)                                     \
  (((MODE) == SFmode && (!TARGET_SSE || !TARGET_SSE_MATH))      \
   || ((MODE) == DFmode && (!TARGET_SSE2 || !TARGET_SSE_MATH))  \
   || (MODE) == XFmode)

I need my morning coffee, the above logic is ATM a bit hard to decipher. ;)
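
One way to decipher it is to evaluate the macro for each flag combination.
A minimal sketch, assuming the target flags and machine modes can be replaced
by plain variables and a small enum for illustration (in GCC they are macros
and generated enum values; is_stack_mode is a hypothetical helper):

```c
/* Stand-ins for GCC's machine modes and target flags. */
enum mode { SFmode, DFmode, XFmode };
static int TARGET_SSE, TARGET_SSE2, TARGET_SSE_MATH;

#define IS_STACK_MODE(MODE)                                     \
  (((MODE) == SFmode && (!TARGET_SSE || !TARGET_SSE_MATH))      \
   || ((MODE) == DFmode && (!TARGET_SSE2 || !TARGET_SSE_MATH))  \
   || (MODE) == XFmode)

/* Evaluate the macro for one combination of flags. */
int is_stack_mode(enum mode m, int sse, int sse2, int sse_math)
{
  TARGET_SSE = sse;
  TARGET_SSE2 = sse2;
  TARGET_SSE_MATH = sse_math;
  return IS_STACK_MODE(m);
}
```

With SSE2 and SSE math both enabled, the macro is false for DFmode, so the
extra SSE_FLOAT_MODE_P test in the proposed patch would not change the result
in that configuration. XFmode is always a stack mode.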


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23322



* [Bug target/23322] [4.1/4.2/4.3 regression] performance regression
       [not found] <bug-23322-10914@http.gcc.gnu.org/bugzilla/>
                   ` (18 preceding siblings ...)
  2008-02-06  6:52 ` ubizjak at gmail dot com
@ 2008-02-06  9:01 ` steven at gcc dot gnu dot org
  2008-02-06 11:27 ` hubicka at gcc dot gnu dot org
                   ` (14 subsequent siblings)
  34 siblings, 0 replies; 35+ messages in thread
From: steven at gcc dot gnu dot org @ 2008-02-06  9:01 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #23 from steven at gcc dot gnu dot org  2008-02-06 09:00 -------
The IS_STACK_MODE trick is mine, and if this affects SSE code generation, the
bug is that IS_STACK_MODE returns true for registers that will not go on the
stack.

I acknowledge the IS_STACK_MODE is a big hammer, but we tested extensively
before committing it.  We have no easy way to know if any registers are carried
over the loop.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23322



* [Bug target/23322] [4.1/4.2/4.3 regression] performance regression
       [not found] <bug-23322-10914@http.gcc.gnu.org/bugzilla/>
                   ` (19 preceding siblings ...)
  2008-02-06  9:01 ` steven at gcc dot gnu dot org
@ 2008-02-06 11:27 ` hubicka at gcc dot gnu dot org
  2008-02-06 11:31 ` hubicka at gcc dot gnu dot org
                   ` (13 subsequent siblings)
  34 siblings, 0 replies; 35+ messages in thread
From: hubicka at gcc dot gnu dot org @ 2008-02-06 11:27 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #24 from hubicka at gcc dot gnu dot org  2008-02-06 11:26 -------
IS_STACK_MODE returns true for MODEs that *might* go to the stack registers.
When we do SSE math, SFmode/DFmode will most likely go into an SSE register, but
they are still valid for stack registers and might go there in some corner cases
(for example when asked for by an asm statement, or when they get returned in
the register per i386 ABI requirements).

I will commit the two fixes today, and then we can try to make the STACK_REGS
hack less active.

Honza


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23322



* [Bug target/23322] [4.1/4.2/4.3 regression] performance regression
       [not found] <bug-23322-10914@http.gcc.gnu.org/bugzilla/>
                   ` (20 preceding siblings ...)
  2008-02-06 11:27 ` hubicka at gcc dot gnu dot org
@ 2008-02-06 11:31 ` hubicka at gcc dot gnu dot org
  2008-02-06 11:43 ` steven at gcc dot gnu dot org
                   ` (12 subsequent siblings)
  34 siblings, 0 replies; 35+ messages in thread
From: hubicka at gcc dot gnu dot org @ 2008-02-06 11:31 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #25 from hubicka at gcc dot gnu dot org  2008-02-06 11:30 -------
Hmm, sorry.  IS_STACK_MODE obviously returns false for SSE math, per Paolo's
commit.  Since it is not used except for the hacks to avoid constant hoisting,
I guess it is OK.  I will commit only the patch for the partial memory write and
try making this heuristic less active.

The code in question is a simple loop initializing an array with a constant.
This seems like something we ought to get right.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23322



* [Bug target/23322] [4.1/4.2/4.3 regression] performance regression
       [not found] <bug-23322-10914@http.gcc.gnu.org/bugzilla/>
                   ` (21 preceding siblings ...)
  2008-02-06 11:31 ` hubicka at gcc dot gnu dot org
@ 2008-02-06 11:43 ` steven at gcc dot gnu dot org
  2008-02-06 11:55 ` hubicka at gcc dot gnu dot org
                   ` (11 subsequent siblings)
  34 siblings, 0 replies; 35+ messages in thread
From: steven at gcc dot gnu dot org @ 2008-02-06 11:43 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #26 from steven at gcc dot gnu dot org  2008-02-06 11:43 -------
You could read up on the following mailing list threads if you want to know
where the IS_STACK_REG check comes from:
http://gcc.gnu.org/ml/gcc-patches/2005-12/msg01859.html
http://gcc.gnu.org/ml/gcc-patches/2006-02/msg01248.html
http://gcc.gnu.org/ml/gcc-patches/2006-02/msg01584.html

You will notice that other RTL passes also have this guard in place.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23322



* [Bug target/23322] [4.1/4.2/4.3 regression] performance regression
       [not found] <bug-23322-10914@http.gcc.gnu.org/bugzilla/>
                   ` (22 preceding siblings ...)
  2008-02-06 11:43 ` steven at gcc dot gnu dot org
@ 2008-02-06 11:55 ` hubicka at gcc dot gnu dot org
  2008-02-06 15:10 ` hubicka at gcc dot gnu dot org
                   ` (10 subsequent siblings)
  34 siblings, 0 replies; 35+ messages in thread
From: hubicka at gcc dot gnu dot org @ 2008-02-06 11:55 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #27 from hubicka at gcc dot gnu dot org  2008-02-06 11:55 -------
I noticed those posts.  Part of the problem might be that hoisting triggers the
partial memory stall bug I fixed. Partial memory stalls are quite expensive, so
this might improve scores without the hack in some cases.  I will at least give
it a try on SPECs.

Also with the DF merge we ought to have liveness readily available?  Perhaps we can
just strictly special-case the initialization loop: count the number of FP regs
touched by the loop that are live across the loop latch edge, and if the count is
at most 1, allow one extra invariant to be lifted. Naturally this is as ugly as it
can get; a prettier solution would be preferred...
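
A minimal sketch of that special case, reading the "at most 1" threshold as
applying to the number of FP registers live across the latch edge (that
reading is an assumption). The struct and function names are hypothetical and
do not correspond to GCC's DF or loop-invariant data structures.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-loop summary; GCC's real liveness data differs. */
struct loop_fp_info {
    const bool *touched;            /* touched[r]: FP reg r is used in the loop */
    const bool *live_across_latch;  /* live_across_latch[r]: r is live on the latch edge */
    size_t nregs;                   /* number of FP registers described */
};

/* Count FP registers that the loop touches and that stay live across
   the latch (back) edge.  */
static size_t fp_regs_live_across_latch(const struct loop_fp_info *l)
{
    size_t count = 0;
    for (size_t r = 0; r < l->nregs; r++)
        if (l->touched[r] && l->live_across_latch[r])
            count++;
    return count;
}

/* If at most one such register exists, allow one extra FP invariant
   to be hoisted; otherwise allow none.  */
static size_t extra_fp_invariants_allowed(const struct loop_fp_info *l)
{
    return fp_regs_live_across_latch(l) <= 1 ? 1 : 0;
}
```

An initialization loop has essentially no loop-carried FP state, so it would
pass this check, while a loop already keeping several values in FP registers
would not.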

Honza


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23322



* [Bug target/23322] [4.1/4.2/4.3 regression] performance regression
       [not found] <bug-23322-10914@http.gcc.gnu.org/bugzilla/>
                   ` (23 preceding siblings ...)
  2008-02-06 11:55 ` hubicka at gcc dot gnu dot org
@ 2008-02-06 15:10 ` hubicka at gcc dot gnu dot org
  2008-02-08 14:55 ` hubicka at gcc dot gnu dot org
                   ` (9 subsequent siblings)
  34 siblings, 0 replies; 35+ messages in thread
From: hubicka at gcc dot gnu dot org @ 2008-02-06 15:10 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #28 from hubicka at gcc dot gnu dot org  2008-02-06 15:10 -------
*** Bug 23305 has been marked as a duplicate of this bug. ***


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23322



* [Bug target/23322] [4.1/4.2/4.3 regression] performance regression
       [not found] <bug-23322-10914@http.gcc.gnu.org/bugzilla/>
                   ` (24 preceding siblings ...)
  2008-02-06 15:10 ` hubicka at gcc dot gnu dot org
@ 2008-02-08 14:55 ` hubicka at gcc dot gnu dot org
  2008-02-11  9:24 ` hubicka at gcc dot gnu dot org
                   ` (8 subsequent siblings)
  34 siblings, 0 replies; 35+ messages in thread
From: hubicka at gcc dot gnu dot org @ 2008-02-08 14:55 UTC (permalink / raw)
  To: gcc-bugs

------- Comment #29 from hubicka at gcc dot gnu dot org  2008-02-08 14:54 -------
Hi,
tonight's results from Haydn show 32bit scores with the loop-invariant hack
disabled.
http://www.suse.de/~gcctest/SPEC/CFP/sb-haydn-head-64-32o-32bit/index.html

There are no off-noise speedups, though I must admit that the 32bit results from
Haydn are truly noisy. I also did runs by hand and those don't seem to show much
difference either, but the noise factor is always difficult to estimate in little
stuff like this.

I will give it a run on britten too to double check; it was running a different
patch tonight.

However I wonder if you do have testcases where the hack helps. In general we
can

1) ignore the problem
2) disable the hack for loop-invariant only (not for GCSE). Loop-invariant is
more sure about benefits and already has some heuristics to limit the pressure.
I do agree that in general having too much stuff in x87 registers kills the
performance on a very random basis.
3) come up with more careful heuristics for loop-invariant, perhaps based on
the number of loop-carried variables or the number of FP registers live across
the loopback edge.

For 3 we would also need to test the benchmarks that originally exposed the problem.
I see it was tested on povray; it would be nice if the test was repeated (and
I can probably do that if no one volunteers ;)

Honza

-- 

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23322


* [Bug target/23322] [4.1/4.2/4.3 regression] performance regression
       [not found] <bug-23322-10914@http.gcc.gnu.org/bugzilla/>
                   ` (25 preceding siblings ...)
  2008-02-08 14:55 ` hubicka at gcc dot gnu dot org
@ 2008-02-11  9:24 ` hubicka at gcc dot gnu dot org
  2008-07-04 20:02 ` [Bug target/23322] [4.2/4.3/4.4 " jsm28 at gcc dot gnu dot org
                   ` (7 subsequent siblings)
  34 siblings, 0 replies; 35+ messages in thread
From: hubicka at gcc dot gnu dot org @ 2008-02-11  9:24 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #30 from hubicka at gcc dot gnu dot org  2008-02-11 09:23 -------
On britten there is also no noticeable effect on SPECfp score.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23322



* [Bug target/23322] [4.2/4.3/4.4 regression] performance regression
       [not found] <bug-23322-10914@http.gcc.gnu.org/bugzilla/>
                   ` (26 preceding siblings ...)
  2008-02-11  9:24 ` hubicka at gcc dot gnu dot org
@ 2008-07-04 20:02 ` jsm28 at gcc dot gnu dot org
  2009-02-03 16:16 ` bonzini at gnu dot org
                   ` (6 subsequent siblings)
  34 siblings, 0 replies; 35+ messages in thread
From: jsm28 at gcc dot gnu dot org @ 2008-07-04 20:02 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #31 from jsm28 at gcc dot gnu dot org  2008-07-04 20:01 -------
Closing 4.1 branch.


-- 

jsm28 at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|[4.1/4.2/4.3/4.4 regression]|[4.2/4.3/4.4 regression]
                   |performance regression      |performance regression
   Target Milestone|4.1.3                       |4.2.5


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23322



* [Bug target/23322] [4.2/4.3/4.4 regression] performance regression
       [not found] <bug-23322-10914@http.gcc.gnu.org/bugzilla/>
                   ` (28 preceding siblings ...)
  2009-02-03 16:16 ` bonzini at gnu dot org
@ 2009-02-03 16:16 ` bonzini at gnu dot org
  2009-02-08 11:32 ` hubicka at gcc dot gnu dot org
                   ` (4 subsequent siblings)
  34 siblings, 0 replies; 35+ messages in thread
From: bonzini at gnu dot org @ 2009-02-03 16:16 UTC (permalink / raw)
  To: gcc-bugs



-- 

bonzini at gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |WAITING


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23322



* [Bug target/23322] [4.2/4.3/4.4 regression] performance regression
       [not found] <bug-23322-10914@http.gcc.gnu.org/bugzilla/>
                   ` (27 preceding siblings ...)
  2008-07-04 20:02 ` [Bug target/23322] [4.2/4.3/4.4 " jsm28 at gcc dot gnu dot org
@ 2009-02-03 16:16 ` bonzini at gnu dot org
  2009-02-03 16:16 ` bonzini at gnu dot org
                   ` (5 subsequent siblings)
  34 siblings, 0 replies; 35+ messages in thread
From: bonzini at gnu dot org @ 2009-02-03 16:16 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #32 from bonzini at gnu dot org  2009-02-03 16:16 -------
The patch for partial memory writes was committed.

How are we doing on this benchmark now?


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23322



* [Bug target/23322] [4.2/4.3/4.4 regression] performance regression
       [not found] <bug-23322-10914@http.gcc.gnu.org/bugzilla/>
                   ` (29 preceding siblings ...)
  2009-02-03 16:16 ` bonzini at gnu dot org
@ 2009-02-08 11:32 ` hubicka at gcc dot gnu dot org
  2009-02-12 13:43 ` ubizjak at gmail dot com
                   ` (3 subsequent siblings)
  34 siblings, 0 replies; 35+ messages in thread
From: hubicka at gcc dot gnu dot org @ 2009-02-08 11:32 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #33 from hubicka at gcc dot gnu dot org  2009-02-08 11:32 -------
Partial memory issues are fixed, but I think we did not change much regarding
register pressure awareness of invariant motion. Steven, what do you think?
I can give it another run on the 32bit tester.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23322



* [Bug target/23322] [4.2/4.3/4.4 regression] performance regression
       [not found] <bug-23322-10914@http.gcc.gnu.org/bugzilla/>
                   ` (30 preceding siblings ...)
  2009-02-08 11:32 ` hubicka at gcc dot gnu dot org
@ 2009-02-12 13:43 ` ubizjak at gmail dot com
  2009-03-31 18:55 ` [Bug target/23322] [4.3/4.4/4.5 " jsm28 at gcc dot gnu dot org
                   ` (2 subsequent siblings)
  34 siblings, 0 replies; 35+ messages in thread
From: ubizjak at gmail dot com @ 2009-02-12 13:43 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #34 from ubizjak at gmail dot com  2009-02-12 13:43 -------
(In reply to comment #33)

> I can give it another run on 32bit tester. 

Yes, please.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23322



* [Bug target/23322] [4.3/4.4/4.5 regression] performance regression
       [not found] <bug-23322-10914@http.gcc.gnu.org/bugzilla/>
                   ` (31 preceding siblings ...)
  2009-02-12 13:43 ` ubizjak at gmail dot com
@ 2009-03-31 18:55 ` jsm28 at gcc dot gnu dot org
  2009-04-16 16:22 ` [Bug target/23322] [4.3 " pinskia at gcc dot gnu dot org
  2009-04-22 15:11 ` rguenth at gcc dot gnu dot org
  34 siblings, 0 replies; 35+ messages in thread
From: jsm28 at gcc dot gnu dot org @ 2009-03-31 18:55 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #35 from jsm28 at gcc dot gnu dot org  2009-03-31 18:54 -------
Closing 4.2 branch.


-- 

jsm28 at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|[4.2/4.3/4.4/4.5 regression]|[4.3/4.4/4.5 regression]
                   |performance regression      |performance regression
   Target Milestone|4.2.5                       |4.3.4


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23322



* [Bug target/23322] [4.3 regression] performance regression
       [not found] <bug-23322-10914@http.gcc.gnu.org/bugzilla/>
                   ` (32 preceding siblings ...)
  2009-03-31 18:55 ` [Bug target/23322] [4.3/4.4/4.5 " jsm28 at gcc dot gnu dot org
@ 2009-04-16 16:22 ` pinskia at gcc dot gnu dot org
  2009-04-22 15:11 ` rguenth at gcc dot gnu dot org
  34 siblings, 0 replies; 35+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2009-04-16 16:22 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #36 from pinskia at gcc dot gnu dot org  2009-04-16 16:22 -------
Fixed via IRA so marking as such.


-- 

pinskia at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|WAITING                     |NEW
      Known to work|2.95.4                      |2.95.4 4.4.0 4.5.0
   Last reconfirmed|2008-02-05 16:18:23         |2009-04-16 16:22:11
               date|                            |
            Summary|[4.3/4.4/4.5 regression]    |[4.3 regression] performance
                   |performance regression      |regression


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23322



* [Bug target/23322] [4.3 regression] performance regression
       [not found] <bug-23322-10914@http.gcc.gnu.org/bugzilla/>
                   ` (33 preceding siblings ...)
  2009-04-16 16:22 ` [Bug target/23322] [4.3 " pinskia at gcc dot gnu dot org
@ 2009-04-22 15:11 ` rguenth at gcc dot gnu dot org
  34 siblings, 0 replies; 35+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2009-04-22 15:11 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #37 from rguenth at gcc dot gnu dot org  2009-04-22 15:10 -------
WONTFIX on the 4.3 branch.


-- 

rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
      Known to fail|3.3.6                       |3.3.6 4.3.3
         Resolution|                            |FIXED
   Target Milestone|4.3.4                       |4.4.0


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23322



end of thread, other threads:[~2009-04-22 15:11 UTC | newest]

Thread overview: 35+ messages (download: mbox.gz / follow: Atom feed)
     [not found] <bug-23322-10914@http.gcc.gnu.org/bugzilla/>
2005-10-31  4:49 ` [Bug target/23322] [4.1 regression] performance regression, possibly related to caching mmitchel at gcc dot gnu dot org
2006-02-24  0:30 ` [Bug target/23322] [4.1/4.2 " mmitchel at gcc dot gnu dot org
2006-05-25  2:34 ` mmitchel at gcc dot gnu dot org
2006-08-27 18:46 ` pinskia at gcc dot gnu dot org
2007-02-14  9:06 ` [Bug target/23322] [4.1/4.2/4.3 " mmitchel at gcc dot gnu dot org
2007-12-13 14:10 ` ubizjak at gmail dot com
2007-12-13 14:13 ` ubizjak at gmail dot com
2007-12-13 14:24 ` ubizjak at gmail dot com
2007-12-13 14:36 ` rguenth at gcc dot gnu dot org
2007-12-13 14:43 ` rguenth at gcc dot gnu dot org
2007-12-13 14:54 ` rguenth at gcc dot gnu dot org
2007-12-13 15:00 ` rguenth at gcc dot gnu dot org
2007-12-13 15:42 ` [Bug target/23322] [4.1/4.2/4.3 regression] performance regression: global regalloc doesn't split live ranges ubizjak at gmail dot com
2008-02-05 16:19 ` hubicka at gcc dot gnu dot org
2008-02-05 16:25 ` [Bug target/23322] [4.1/4.2/4.3 regression] performance regression hubicka at gcc dot gnu dot org
2008-02-05 18:26 ` ubizjak at gmail dot com
2008-02-05 22:51 ` hubicka at gcc dot gnu dot org
2008-02-05 23:54 ` hubicka at gcc dot gnu dot org
2008-02-06  6:52 ` ubizjak at gmail dot com
2008-02-06  9:01 ` steven at gcc dot gnu dot org
2008-02-06 11:27 ` hubicka at gcc dot gnu dot org
2008-02-06 11:31 ` hubicka at gcc dot gnu dot org
2008-02-06 11:43 ` steven at gcc dot gnu dot org
2008-02-06 11:55 ` hubicka at gcc dot gnu dot org
2008-02-06 15:10 ` hubicka at gcc dot gnu dot org
2008-02-08 14:55 ` hubicka at gcc dot gnu dot org
2008-02-11  9:24 ` hubicka at gcc dot gnu dot org
2008-07-04 20:02 ` [Bug target/23322] [4.2/4.3/4.4 " jsm28 at gcc dot gnu dot org
2009-02-03 16:16 ` bonzini at gnu dot org
2009-02-03 16:16 ` bonzini at gnu dot org
2009-02-08 11:32 ` hubicka at gcc dot gnu dot org
2009-02-12 13:43 ` ubizjak at gmail dot com
2009-03-31 18:55 ` [Bug target/23322] [4.3/4.4/4.5 " jsm28 at gcc dot gnu dot org
2009-04-16 16:22 ` [Bug target/23322] [4.3 " pinskia at gcc dot gnu dot org
2009-04-22 15:11 ` rguenth at gcc dot gnu dot org
