public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug tree-optimization/42216] [4.5 Regression] 464.h265ref peak regressed 20%
  2009-11-29 13:39 [Bug tree-optimization/42216] New: [4.5 Regression] 464.h265ref peak regressed 20% rguenth at gcc dot gnu dot org
@ 2009-11-29 13:39 ` rguenth at gcc dot gnu dot org
  2009-11-29 17:58 ` rth at gcc dot gnu dot org
                   ` (28 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2009-11-29 13:39 UTC (permalink / raw)
  To: gcc-bugs



-- 

rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |4.5.0


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42216


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug tree-optimization/42216]  New: [4.5 Regression] 464.h265ref peak regressed 20%
@ 2009-11-29 13:39 rguenth at gcc dot gnu dot org
  2009-11-29 13:39 ` [Bug tree-optimization/42216] " rguenth at gcc dot gnu dot org
                   ` (29 more replies)
  0 siblings, 30 replies; 31+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2009-11-29 13:39 UTC (permalink / raw)
  To: gcc-bugs

See http://gcc.opensuse.org/SPEC/CINT/sb-balakirew-head-64-2006/recent.html

First bad rev. is 154713, last good is 154686.

It's quite obvious that either the regrename.c changes have code generation
differences or that removing the vec_interleave_* expanders caused this
regression.

Richard - you removed all vec_interleave_* expanders but the vectorizer
still checks for optab_for_tree_code (VEC_INTERLEAVE_*_EXPR) which ends
up looking at vec_interleave_*_optab.  Either this code is all dead now
or you caused the vectorization no longer to apply.

I'll check if reverting your revision gets back performance.


-- 
           Summary: [4.5 Regression] 464.h265ref peak regressed 20%
           Product: gcc
           Version: 4.5.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: tree-optimization
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: rguenth at gcc dot gnu dot org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42216


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug tree-optimization/42216] [4.5 Regression] 464.h265ref peak regressed 20%
  2009-11-29 13:39 [Bug tree-optimization/42216] New: [4.5 Regression] 464.h265ref peak regressed 20% rguenth at gcc dot gnu dot org
  2009-11-29 13:39 ` [Bug tree-optimization/42216] " rguenth at gcc dot gnu dot org
@ 2009-11-29 17:58 ` rth at gcc dot gnu dot org
  2009-11-30  9:58 ` [Bug tree-optimization/42216] [4.5 Regression] 464.h264ref " rguenther at suse dot de
                   ` (27 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: rth at gcc dot gnu dot org @ 2009-11-29 17:58 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #1 from rth at gcc dot gnu dot org  2009-11-29 17:58 -------
The vec_interleave_*_optab should still be populated.  It's just
that what was once "sse2_punpcklwd" is now "vec_interleave_lowv8hi"
directly.  If this patch *is* attributable to a regression, then
perhaps there's a typo somewhere in the substitutions...


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42216


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug tree-optimization/42216] [4.5 Regression] 464.h264ref peak regressed 20%
  2009-11-29 13:39 [Bug tree-optimization/42216] New: [4.5 Regression] 464.h265ref peak regressed 20% rguenth at gcc dot gnu dot org
  2009-11-29 13:39 ` [Bug tree-optimization/42216] " rguenth at gcc dot gnu dot org
  2009-11-29 17:58 ` rth at gcc dot gnu dot org
@ 2009-11-30  9:58 ` rguenther at suse dot de
  2009-11-30 10:54 ` hjl dot tools at gmail dot com
                   ` (26 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: rguenther at suse dot de @ 2009-11-30  9:58 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #2 from rguenther at suse dot de  2009-11-30 09:58 -------
Subject: Re:  [4.5 Regression] 464.h265ref peak
 regressed 20%

On Sun, 29 Nov 2009, rth at gcc dot gnu dot org wrote:

> ------- Comment #1 from rth at gcc dot gnu dot org  2009-11-29 17:58 -------
> The vec_interleave_*_optab should still be populated.  It's just
> that what was once "sse2_punpcklwd" is now "vec_interleave_lowv8hi"
> directly.  If this patch *is* attributable to a regression, then
> perhaps there's a typo somewhere in the substitutions...

Indeed it wasn't your patch.  Checking with Bernds patch reverted now.

Richard.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42216


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug tree-optimization/42216] [4.5 Regression] 464.h264ref peak regressed 20%
  2009-11-29 13:39 [Bug tree-optimization/42216] New: [4.5 Regression] 464.h265ref peak regressed 20% rguenth at gcc dot gnu dot org
                   ` (2 preceding siblings ...)
  2009-11-30  9:58 ` [Bug tree-optimization/42216] [4.5 Regression] 464.h264ref " rguenther at suse dot de
@ 2009-11-30 10:54 ` hjl dot tools at gmail dot com
  2009-12-01  9:58 ` [Bug tree-optimization/42216] [4.5 Regression] rev 15458[78] regress 464.h264ref peak 20% rguenth at gcc dot gnu dot org
                   ` (25 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: hjl dot tools at gmail dot com @ 2009-11-30 10:54 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #3 from hjl dot tools at gmail dot com  2009-11-30 10:54 -------
This may be related to revision 154688, which has caused PR 42202.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42216


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug tree-optimization/42216] [4.5 Regression] rev 15458[78] regress 464.h264ref peak 20%
  2009-11-29 13:39 [Bug tree-optimization/42216] New: [4.5 Regression] 464.h265ref peak regressed 20% rguenth at gcc dot gnu dot org
                   ` (3 preceding siblings ...)
  2009-11-30 10:54 ` hjl dot tools at gmail dot com
@ 2009-12-01  9:58 ` rguenth at gcc dot gnu dot org
  2009-12-01 15:27 ` bernds_cb1 at t-online dot de
                   ` (24 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2009-12-01  9:58 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #4 from rguenth at gcc dot gnu dot org  2009-12-01 09:58 -------
It was indeed Bernds patch.  Bernd - the patch discription suggested it was
a compile-time performance fix and not going to change code?  There is quite
big effect (including this negative one) on SPEC CPU 2006 on x86_64 ...


-- 

rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |bernds at gcc dot gnu dot
                   |                            |org
             Status|UNCONFIRMED                 |NEW
     Ever Confirmed|0                           |1
   Last reconfirmed|0000-00-00 00:00:00         |2009-12-01 09:58:01
               date|                            |
            Summary|[4.5 Regression] 464.h264ref|[4.5 Regression] rev
                   |peak regressed 20%          |15458[78] regress
                   |                            |464.h264ref peak 20%


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42216


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug tree-optimization/42216] [4.5 Regression] rev 15458[78] regress 464.h264ref peak 20%
  2009-11-29 13:39 [Bug tree-optimization/42216] New: [4.5 Regression] 464.h265ref peak regressed 20% rguenth at gcc dot gnu dot org
                   ` (4 preceding siblings ...)
  2009-12-01  9:58 ` [Bug tree-optimization/42216] [4.5 Regression] rev 15458[78] regress 464.h264ref peak 20% rguenth at gcc dot gnu dot org
@ 2009-12-01 15:27 ` bernds_cb1 at t-online dot de
  2009-12-01 15:36 ` rguenther at suse dot de
                   ` (23 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: bernds_cb1 at t-online dot de @ 2009-12-01 15:27 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #5 from bernds_cb1 at t-online dot de  2009-12-01 15:26 -------
Code generation changes are expected for two reasons - the code became less
conservative when determining conflicts with other registers, so we can usually
rename in more cases.  There are also a few cases where we have to give up on
renaming due to multiword hard register overlaps we can't track.

Unfortunately I don't have access to spec; I'll have to think about how to
track this down.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42216


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug tree-optimization/42216] [4.5 Regression] rev 15458[78] regress 464.h264ref peak 20%
  2009-11-29 13:39 [Bug tree-optimization/42216] New: [4.5 Regression] 464.h265ref peak regressed 20% rguenth at gcc dot gnu dot org
                   ` (5 preceding siblings ...)
  2009-12-01 15:27 ` bernds_cb1 at t-online dot de
@ 2009-12-01 15:36 ` rguenther at suse dot de
  2009-12-01 16:46 ` [Bug tree-optimization/42216] [4.5 Regression] rev 154688 " rguenth at gcc dot gnu dot org
                   ` (22 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: rguenther at suse dot de @ 2009-12-01 15:36 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #6 from rguenther at suse dot de  2009-12-01 15:35 -------
Subject: Re:  [4.5 Regression] rev 15458[78]
 regress 464.h264ref peak 20%

On Tue, 1 Dec 2009, bernds_cb1 at t-online dot de wrote:

> ------- Comment #5 from bernds_cb1 at t-online dot de  2009-12-01 15:26 -------
> Code generation changes are expected for two reasons - the code became less
> conservative when determining conflicts with other registers, so we can usually
> rename in more cases.  There are also a few cases where we have to give up on
> renaming due to multiword hard register overlaps we can't track.
> 
> Unfortunately I don't have access to spec; I'll have to think about how to
> track this down.

I will try to come up with a testcase for you.  I suppose it might
be a backend issue as regrename should mainly effect the 2nd
scheduling pass, right?

Richard.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42216


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug tree-optimization/42216] [4.5 Regression] rev 154688 regress 464.h264ref peak 20%
  2009-11-29 13:39 [Bug tree-optimization/42216] New: [4.5 Regression] 464.h265ref peak regressed 20% rguenth at gcc dot gnu dot org
                   ` (6 preceding siblings ...)
  2009-12-01 15:36 ` rguenther at suse dot de
@ 2009-12-01 16:46 ` rguenth at gcc dot gnu dot org
  2009-12-01 17:03 ` rguenth at gcc dot gnu dot org
                   ` (21 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2009-12-01 16:46 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #7 from rguenth at gcc dot gnu dot org  2009-12-01 16:46 -------
Just reverting rev. 154688 and using the training set gets us from

464.h264ref        --        228         -- S

to

464.h264ref        --        170         -- S                                  

at -O3 -ffast-math -funroll-loops -march-native (-march=k8-sse3 -msahf --param
l1-cache-size=64 --param l1-cache-line-size=64 --param l2-cache-size=512
-mtune=k8).

After the patch the oprofile looks like

CPU: AMD64 processors, speed 2000 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Cycles outside of halt state) with a unit mask
of 0x00 (No unit mask) count 100000
samples  %        symbol name
2241755  51.1739  SetupFastFullPelSearch
442008   10.0900  BlockMotionSearch
271429    6.1961  SubPelBlockMotionSearch
269337    6.1483  FastPelY_14
170961    3.9026  UMVLine16Y_11
159311    3.6367  SetupLargerBlocks
155556    3.5510  SATD
127230    2.9044  FastLine16Y_11
72328     1.6511  dct_luma
69728     1.5917  getNonAffNeighbour

All but the *Y_1[14] functions are inside mv-search.c, just re-compiling
that file is enough to reproduce the issue.

after reverting it

CPU: AMD64 processors, speed 2000 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Cycles outside of halt state) with a unit mask
of 0x00 (No unit mask) count 100000
samples  %        symbol name
1223269  36.8289  SetupFastFullPelSearch
450306   13.5573  BlockMotionSearch
320395    9.6461  SubPelBlockMotionSearch
245175    7.3815  FastPelY_14
160694    4.8380  SetupLargerBlocks
155353    4.6772  SATD
153292    4.6152  UMVLine16Y_11
79531     2.3944  FastLine16Y_11
72455     2.1814  dct_luma
70733     2.1296  getNonAffNeighbour
42828     1.2894  UMVPelY_14
34396     1.0356  UnifiedOneForthPix


-- 

rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|[4.5 Regression] rev        |[4.5 Regression] rev 154688
                   |15458[78] regress           |regress 464.h264ref peak 20%
                   |464.h264ref peak 20%        |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42216


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug tree-optimization/42216] [4.5 Regression] rev 154688 regress 464.h264ref peak 20%
  2009-11-29 13:39 [Bug tree-optimization/42216] New: [4.5 Regression] 464.h265ref peak regressed 20% rguenth at gcc dot gnu dot org
                   ` (7 preceding siblings ...)
  2009-12-01 16:46 ` [Bug tree-optimization/42216] [4.5 Regression] rev 154688 " rguenth at gcc dot gnu dot org
@ 2009-12-01 17:03 ` rguenth at gcc dot gnu dot org
  2009-12-01 17:26 ` bernds_cb1 at t-online dot de
                   ` (20 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2009-12-01 17:03 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #8 from rguenth at gcc dot gnu dot org  2009-12-01 17:03 -------
The hot loop is mv-search.c:SetupFastFullPelSearch

  for (pos = 0; pos < max_pos; pos++)
  {
    abs_y = offset_y + spiral_search_y[pos];
    abs_x = offset_x + spiral_search_x[pos];

    if (range_partly_outside)
    {
      if (abs_y >= 0 && abs_y <= max_height &&
          abs_x >= 0 && abs_x <= max_width    )
      {
        PelYline_11 = FastLine16Y_11;
      }
      else
      {
        PelYline_11 = UMVLine16Y_11;
      }
    }

    orgptr = orig_blocks;
    bindex = 0;
    for (blky = 0; blky < 4; blky++)
    {
      LineSadBlk0 = LineSadBlk1 = LineSadBlk2 = LineSadBlk3 = 0;
      for (y = 0; y < 4; y++)
      {
        refptr = PelYline_11 (ref_pic, abs_y++, abs_x, img_height, img_width);

        LineSadBlk0 += byte_abs [*refptr++ - *orgptr++];
        LineSadBlk0 += byte_abs [*refptr++ - *orgptr++];
        LineSadBlk0 += byte_abs [*refptr++ - *orgptr++];
        LineSadBlk0 += byte_abs [*refptr++ - *orgptr++];
        LineSadBlk1 += byte_abs [*refptr++ - *orgptr++];
        LineSadBlk1 += byte_abs [*refptr++ - *orgptr++];
        LineSadBlk1 += byte_abs [*refptr++ - *orgptr++];
        LineSadBlk1 += byte_abs [*refptr++ - *orgptr++];
        LineSadBlk2 += byte_abs [*refptr++ - *orgptr++];
        LineSadBlk2 += byte_abs [*refptr++ - *orgptr++];
        LineSadBlk2 += byte_abs [*refptr++ - *orgptr++];
        LineSadBlk2 += byte_abs [*refptr++ - *orgptr++];
        LineSadBlk3 += byte_abs [*refptr++ - *orgptr++];
        LineSadBlk3 += byte_abs [*refptr++ - *orgptr++];
        LineSadBlk3 += byte_abs [*refptr++ - *orgptr++];
        LineSadBlk3 += byte_abs [*refptr++ - *orgptr++];
      }
      block_sad[bindex++][pos] = LineSadBlk0;
      block_sad[bindex++][pos] = LineSadBlk1;
      block_sad[bindex++][pos] = LineSadBlk2;
      block_sad[bindex++][pos] = LineSadBlk3;
    }
  }

good assembly of the innermost loop:

.L1422:
        leal    1(%rsi), %r9d
        movl    64(%rsp), %r8d
        movl    68(%rsp), %ecx
        movl    52(%rsp), %edx
        movq    72(%rsp), %rdi
        movl    %r9d, 32(%rsp)
        call    *%rax
        movzwl  (%rbx), %ecx
        movzwl  (%rax), %edx
        movzwl  2(%rbx), %r10d
        movzwl  2(%rax), %r11d
        movq    byte_abs(%rip), %r9
        movzwl  4(%rbx), %esi
        subl    %ecx, %edx
        movzwl  4(%rax), %ecx
        movslq  %edx, %r8
        subl    %r10d, %r11d
        movzwl  6(%rax), %r10d
        addl    (%r9,%r8,4), %r14d
        movzwl  6(%rbx), %r8d
        movslq  %r11d, %rdi
        addl    (%r9,%rdi,4), %r14d
        movzwl  8(%rbx), %edi
        subl    %esi, %ecx
        movslq  %ecx, %rdx
        movzwl  8(%rax), %ecx
        subl    %r8d, %r10d
        addl    (%r9,%rdx,4), %r14d
        movzwl  10(%rax), %r8d
        movzwl  10(%rbx), %edx
        movslq  %r10d, %r11
        addl    (%r9,%r11,4), %r14d
        movzwl  12(%rbx), %r11d
        subl    %edi, %ecx
        movzwl  12(%rax), %edi
        movslq  %ecx, %rsi
        subl    %edx, %r8d
        addl    (%r9,%rsi,4), %r13d
        movzwl  14(%rax), %edx
        movzwl  14(%rbx), %esi
        movslq  %r8d, %r10
        subl    %r11d, %edi
        addl    (%r9,%r10,4), %r13d
        movzwl  16(%rax), %r11d
        movzwl  16(%rbx), %r10d
        movslq  %edi, %rcx
        addl    (%r9,%rcx,4), %r13d
        movzwl  18(%rbx), %ecx
        subl    %esi, %edx
        movslq  %edx, %r8
        movzwl  18(%rax), %edx
        subl    %r10d, %r11d
        addl    (%r9,%r8,4), %r13d
        movzwl  20(%rax), %r10d
        movzwl  20(%rbx), %r8d
        movslq  %r11d, %rdi
        addl    (%r9,%rdi,4), %ebp
        movzwl  22(%rbx), %edi
        subl    %ecx, %edx
        movzwl  22(%rax), %ecx
        movslq  %edx, %rsi
        subl    %r8d, %r10d
        addl    (%r9,%rsi,4), %ebp
        movzwl  24(%rax), %r8d
        movzwl  24(%rbx), %esi
        movslq  %r10d, %r11
        subl    %edi, %ecx
        addl    (%r9,%r11,4), %ebp
        movzwl  26(%rax), %edi
        movslq  %ecx, %rdx
        movzwl  26(%rbx), %r11d
        addl    (%r9,%rdx,4), %ebp
        movzwl  28(%rbx), %edx
        subl    %esi, %r8d
        movslq  %r8d, %r10
        movzwl  28(%rax), %r8d
        movzwl  30(%rax), %eax
        addl    (%r9,%r10,4), %r12d
        movzwl  30(%rbx), %r10d
        subl    %r11d, %edi
        movslq  %edi, %rcx
        addl    (%r9,%rcx,4), %r12d
        subl    %edx, %r8d
        subl    %r10d, %eax
        movslq  %r8d, %rsi
        addl    (%r9,%rsi,4), %r12d
        cltq
        addl    (%r9,%rax,4), %r12d
        addq    $32, %rbx
        addq    $32, %r15
        cmpq    $128, %r15
        je      .L1600
        movq    PelYline_11(%rip), %rax
        movl    32(%rsp), %esi
        jmp     .L1422

bad assembly:

.L1422:
        leal    1(%rsi), %edx
        movl    68(%rsp), %ecx
        movl    64(%rsp), %r8d
        movq    72(%rsp), %rdi
        movl    %edx, 32(%rsp)
        movl    52(%rsp), %edx
        call    *%rax
        movzwl  (%rbx), %esi
        movzwl  (%rax), %ecx
        movq    byte_abs(%rip), %rdx
        subl    %esi, %ecx
        movzwl  2(%rbx), %esi
        movslq  %ecx, %rcx
        addl    (%rdx,%rcx,4), %r14d
        movzwl  2(%rax), %ecx
        subl    %esi, %ecx
        movzwl  4(%rbx), %esi
        movslq  %ecx, %rcx
        addl    (%rdx,%rcx,4), %r14d
        movzwl  4(%rax), %ecx
        subl    %esi, %ecx
        movzwl  6(%rbx), %esi
        movslq  %ecx, %rcx
        addl    (%rdx,%rcx,4), %r14d
        movzwl  6(%rax), %ecx
        subl    %esi, %ecx
        movzwl  8(%rbx), %esi
        movslq  %ecx, %rcx
        addl    (%rdx,%rcx,4), %r14d
        movzwl  8(%rax), %ecx
        subl    %esi, %ecx
        movzwl  10(%rbx), %esi
        movslq  %ecx, %rcx
        addl    (%rdx,%rcx,4), %r13d
        movzwl  10(%rax), %ecx
        subl    %esi, %ecx
        movzwl  12(%rbx), %esi
        movslq  %ecx, %rcx
        addl    (%rdx,%rcx,4), %r13d
        movzwl  12(%rax), %ecx
        subl    %esi, %ecx
        movzwl  14(%rbx), %esi
        movslq  %ecx, %rcx
        addl    (%rdx,%rcx,4), %r13d
        movzwl  14(%rax), %ecx
        subl    %esi, %ecx
        movzwl  16(%rbx), %esi
        movslq  %ecx, %rcx
        addl    (%rdx,%rcx,4), %r13d
        movzwl  16(%rax), %ecx
        subl    %esi, %ecx
        movzwl  18(%rbx), %esi
        movslq  %ecx, %rcx
        addl    (%rdx,%rcx,4), %ebp
        movzwl  18(%rax), %ecx
        subl    %esi, %ecx
        movzwl  20(%rbx), %esi
        movslq  %ecx, %rcx
        addl    (%rdx,%rcx,4), %ebp
        movzwl  20(%rax), %ecx
        subl    %esi, %ecx
        movzwl  22(%rbx), %esi
        movslq  %ecx, %rcx
        addl    (%rdx,%rcx,4), %ebp
        movzwl  22(%rax), %ecx
        subl    %esi, %ecx
        movzwl  24(%rbx), %esi
        movslq  %ecx, %rcx
        addl    (%rdx,%rcx,4), %ebp
        movzwl  24(%rax), %ecx
        subl    %esi, %ecx
        movzwl  26(%rbx), %esi
        movslq  %ecx, %rcx
        addl    (%rdx,%rcx,4), %r12d
        movzwl  26(%rax), %ecx
        subl    %esi, %ecx
        movzwl  28(%rbx), %esi
        movslq  %ecx, %rcx
        addl    (%rdx,%rcx,4), %r12d
        movzwl  28(%rax), %ecx
        movzwl  30(%rax), %eax
        subl    %esi, %ecx
        movslq  %ecx, %rcx
        addl    (%rdx,%rcx,4), %r12d
        movzwl  30(%rbx), %ecx
        subl    %ecx, %eax
        cltq
        addl    (%rdx,%rax,4), %r12d
        addq    $32, %rbx
        addq    $32, %r15
        cmpq    $128, %r15
        je      .L1600
        movq    PelYline_11(%rip), %rax
        movl    32(%rsp), %esi
        jmp     .L1422

seems to be really only scheduling differences...


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42216


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug tree-optimization/42216] [4.5 Regression] rev 154688 regress 464.h264ref peak 20%
  2009-11-29 13:39 [Bug tree-optimization/42216] New: [4.5 Regression] 464.h265ref peak regressed 20% rguenth at gcc dot gnu dot org
                   ` (8 preceding siblings ...)
  2009-12-01 17:03 ` rguenth at gcc dot gnu dot org
@ 2009-12-01 17:26 ` bernds_cb1 at t-online dot de
  2009-12-01 17:27 ` bernds_cb1 at t-online dot de
                   ` (19 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: bernds_cb1 at t-online dot de @ 2009-12-01 17:26 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #9 from bernds_cb1 at t-online dot de  2009-12-01 17:25 -------
Created an attachment (id=19202)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=19202&action=view)
Test patch to attempt to narrow down the problem


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42216


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug tree-optimization/42216] [4.5 Regression] rev 154688 regress 464.h264ref peak 20%
  2009-11-29 13:39 [Bug tree-optimization/42216] New: [4.5 Regression] 464.h265ref peak regressed 20% rguenth at gcc dot gnu dot org
                   ` (9 preceding siblings ...)
  2009-12-01 17:26 ` bernds_cb1 at t-online dot de
@ 2009-12-01 17:27 ` bernds_cb1 at t-online dot de
  2009-12-02 12:15 ` [Bug rtl-optimization/42216] [4.5 Regression] changes in scheduling regress 464.h264ref 20% rguenth at gcc dot gnu dot org
                   ` (18 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: bernds_cb1 at t-online dot de @ 2009-12-01 17:27 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #10 from bernds_cb1 at t-online dot de  2009-12-01 17:27 -------
Yes, regrename should only affect the second scheduling pass.

I'm attaching a stab-in-the-dark patch.  It contains a fix for the ia64
regressions (testing in progress), it adds a few warnings which could be useful
to spot whether the testcase is producing the corner cases we refuse to handle,
and it makes some changes to the register selection.  The idea is to prefer
registers which aren't used in the basic block yet, which should give the best
freedom to the scheduler.

I'd be glad if you could play with this and let me know the results.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42216


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug rtl-optimization/42216] [4.5 Regression] changes in scheduling regress 464.h264ref 20%
  2009-11-29 13:39 [Bug tree-optimization/42216] New: [4.5 Regression] 464.h265ref peak regressed 20% rguenth at gcc dot gnu dot org
                   ` (10 preceding siblings ...)
  2009-12-01 17:27 ` bernds_cb1 at t-online dot de
@ 2009-12-02 12:15 ` rguenth at gcc dot gnu dot org
  2009-12-02 13:16 ` rguenth at gcc dot gnu dot org
                   ` (17 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2009-12-02 12:15 UTC (permalink / raw)
  To: gcc-bugs



-- 

rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P2


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42216


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug rtl-optimization/42216] [4.5 Regression] changes in scheduling regress 464.h264ref 20%
  2009-11-29 13:39 [Bug tree-optimization/42216] New: [4.5 Regression] 464.h265ref peak regressed 20% rguenth at gcc dot gnu dot org
                   ` (11 preceding siblings ...)
  2009-12-02 12:15 ` [Bug rtl-optimization/42216] [4.5 Regression] changes in scheduling regress 464.h264ref 20% rguenth at gcc dot gnu dot org
@ 2009-12-02 13:16 ` rguenth at gcc dot gnu dot org
  2009-12-02 13:17 ` rguenth at gcc dot gnu dot org
                   ` (16 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2009-12-02 13:16 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #11 from rguenth at gcc dot gnu dot org  2009-12-02 13:16 -------
The patch didn't change performance.  For the relevant functions I see

mv-search.c:452:1: warning: failing block due to bad pattern

(the location seems to be always the end of the function though).

The insns are most of the time

(insn 333 1737 334 39 mv-search.c:542 (set (reg:DI 0 ax [orig:1365
block_c.pos_x ] [1365])
        (sign_extend:DI (reg:SI 0 ax))) 126 {extendsidi2_rex64} (nil))

or similar.

For the hot function with the regression I see:

(insn 287 2237 289 9 mv-search.c:327 (set (reg:DI 0 ax [803])
        (sign_extend:DI (reg:SI 0 ax [802]))) 126 {extendsidi2_rex64} (nil))

(insn 437 436 438 27 mv-search.c:305 (parallel [
            (set (reg:DI 5 di [853])
                (zero_extend:DI (plus:SI (reg:SI 5 di [orig:737 ivtmp.1996 ]
[737])
                        (const_int -1 [0xffffffffffffffff]))))
            (clobber (reg:CC 17 flags))
        ]) 253 {*addsi_1_zext} (expr_list:REG_UNUSED (reg:CC 17 flags)
        (nil)))

(insn 663 662 664 55 mv-search.c:436 (set (reg:DI 0 ax [940])
        (sign_extend:DI (reg:SI 0 ax [939]))) 126 {extendsidi2_rex64} (nil))

(insn 719 2224 721 62 mv-search.c:327 (set (reg:DI 0 ax [948])
        (sign_extend:DI (reg:SI 0 ax [947]))) 126 {extendsidi2_rex64} (nil))

line 436 is inside the hot loop in question, I'll attach a dump of the
BB in question we skip due to it.  After the above insn only the following
extra insns appear:

(insn 664 663 665 55 mv-search.c:436 (parallel [
            (set (reg/v:SI 41 r12 [orig:394 LineSadBlk3 ] [394])
                (plus:SI (reg/v:SI 41 r12 [orig:394 LineSadBlk3 ] [394])
                    (mem:SI (plus:DI (mult:DI (reg:DI 0 ax [940])
                                (const_int 4 [0x4]))
                            (reg/f:DI 1 dx [orig:479 byte_abs.393 ] [479])) [2
S4 A32])))  
            (clobber (reg:CC 17 flags))
        ]) 251 {*addsi_1} (expr_list:REG_DEAD (reg/f:DI 1 dx [orig:479
byte_abs.393 ] [479])
        (expr_list:REG_DEAD (reg:DI 0 ax [940])
            (expr_list:REG_UNUSED (reg:CC 17 flags)
                (nil)))))
(note 665 664 666 55 NOTE_INSN_DELETED)
(insn 666 665 667 55 mv-search.c:305 (parallel [
            (set (reg/v/f:DI 3 bx [orig:404 orgptr ] [404])
                (plus:DI (reg/v/f:DI 3 bx [orig:404 orgptr ] [404])
                    (const_int 32 [0x20])))
            (clobber (reg:CC 17 flags))
        ]) 252 {*adddi_1} (expr_list:REG_UNUSED (reg:CC 17 flags)
        (nil)))
(insn 667 666 669 55 mv-search.c:305 (parallel [
            (set (reg:DI 44 r15 [orig:398 ivtmp.1894 ] [398])
                (plus:DI (reg:DI 44 r15 [orig:398 ivtmp.1894 ] [398])
                    (const_int 32 [0x20])))
            (clobber (reg:CC 17 flags))
        ]) 252 {*adddi_1} (expr_list:REG_UNUSED (reg:CC 17 flags)
        (nil)))
(insn 669 667 670 55 mv-search.c:417 (set (reg:CCZ 17 flags)
        (compare:CCZ (reg:DI 44 r15 [orig:398 ivtmp.1894 ] [398])
            (const_int 128 [0x80]))) 7 {*cmpdi_1} (nil))
(jump_insn 670 669 671 55 mv-search.c:417 (set (pc)
        (if_then_else (ne (reg:CCZ 17 flags)
                (const_int 0 [0x0]))
            (label_ref 668)
            (pc))) 619 {*jcc_1} (expr_list:REG_DEAD (reg:CCZ 17 flags)
        (expr_list:REG_BR_PROB (const_int 7500 [0x1d4c])
            (nil)))
 -> 668)
;; lr  out       3 [bx] 6 [bp] 7 [sp] 41 [r12] 42 [r13] 43 [r14] 44 [r15]
;; live  out     3 [bx] 6 [bp] 7 [sp] 41 [r12] 42 [r13] 43 [r14] 44 [r15]


They all look like they'd be handled just fine.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42216


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug rtl-optimization/42216] [4.5 Regression] changes in scheduling regress 464.h264ref 20%
  2009-11-29 13:39 [Bug tree-optimization/42216] New: [4.5 Regression] 464.h265ref peak regressed 20% rguenth at gcc dot gnu dot org
                   ` (12 preceding siblings ...)
  2009-12-02 13:16 ` rguenth at gcc dot gnu dot org
@ 2009-12-02 13:17 ` rguenth at gcc dot gnu dot org
  2009-12-02 16:28 ` bernds_cb1 at t-online dot de
                   ` (15 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2009-12-02 13:17 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #12 from rguenth at gcc dot gnu dot org  2009-12-02 13:16 -------
Created an attachment (id=19208)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=19208&action=view)
basic-block that is rejected


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42216


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug rtl-optimization/42216] [4.5 Regression] changes in scheduling regress 464.h264ref 20%
  2009-11-29 13:39 [Bug tree-optimization/42216] New: [4.5 Regression] 464.h265ref peak regressed 20% rguenth at gcc dot gnu dot org
                   ` (13 preceding siblings ...)
  2009-12-02 13:17 ` rguenth at gcc dot gnu dot org
@ 2009-12-02 16:28 ` bernds_cb1 at t-online dot de
  2009-12-02 16:29 ` bernds_cb1 at t-online dot de
                   ` (14 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: bernds_cb1 at t-online dot de @ 2009-12-02 16:28 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #13 from bernds_cb1 at t-online dot de  2009-12-02 16:28 -------
Created an attachment (id=19211)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=19211&action=view)
Patch to make it less conservative when accepting matching constraints with
different modes


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42216


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug rtl-optimization/42216] [4.5 Regression] changes in scheduling regress 464.h264ref 20%
  2009-11-29 13:39 [Bug tree-optimization/42216] New: [4.5 Regression] 464.h265ref peak regressed 20% rguenth at gcc dot gnu dot org
                   ` (14 preceding siblings ...)
  2009-12-02 16:28 ` bernds_cb1 at t-online dot de
@ 2009-12-02 16:29 ` bernds_cb1 at t-online dot de
  2009-12-02 16:57 ` rguenth at gcc dot gnu dot org
                   ` (13 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: bernds_cb1 at t-online dot de @ 2009-12-02 16:29 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #14 from bernds_cb1 at t-online dot de  2009-12-02 16:29 -------
Would you mind trying another patch, both for testing performance, and if that
fails, for interesting warnings?


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42216


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug rtl-optimization/42216] [4.5 Regression] changes in scheduling regress 464.h264ref 20%
  2009-11-29 13:39 [Bug tree-optimization/42216] New: [4.5 Regression] 464.h265ref peak regressed 20% rguenth at gcc dot gnu dot org
                   ` (15 preceding siblings ...)
  2009-12-02 16:29 ` bernds_cb1 at t-online dot de
@ 2009-12-02 16:57 ` rguenth at gcc dot gnu dot org
  2009-12-02 17:06 ` bernds_cb1 at t-online dot de
                   ` (12 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2009-12-02 16:57 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #15 from rguenth at gcc dot gnu dot org  2009-12-02 16:57 -------
With the second patch there are no warnings when building 464.h264ref (I have
reverted the first patch before installing the 2nd, it didn't apply cleanly
otherwise).  Performance is back to normal with the patch applied.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42216


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug rtl-optimization/42216] [4.5 Regression] changes in scheduling regress 464.h264ref 20%
  2009-11-29 13:39 [Bug tree-optimization/42216] New: [4.5 Regression] 464.h265ref peak regressed 20% rguenth at gcc dot gnu dot org
                   ` (16 preceding siblings ...)
  2009-12-02 16:57 ` rguenth at gcc dot gnu dot org
@ 2009-12-02 17:06 ` bernds_cb1 at t-online dot de
  2009-12-02 17:07 ` rguenther at suse dot de
                   ` (11 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: bernds_cb1 at t-online dot de @ 2009-12-02 17:06 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #16 from bernds_cb1 at t-online dot de  2009-12-02 17:06 -------
Thanks for testing.  This means the regression is fixed by the patch?  I'll do
a full test run then.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42216


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug rtl-optimization/42216] [4.5 Regression] changes in scheduling regress 464.h264ref 20%
  2009-11-29 13:39 [Bug tree-optimization/42216] New: [4.5 Regression] 464.h265ref peak regressed 20% rguenth at gcc dot gnu dot org
                   ` (17 preceding siblings ...)
  2009-12-02 17:06 ` bernds_cb1 at t-online dot de
@ 2009-12-02 17:07 ` rguenther at suse dot de
  2009-12-04 14:26 ` bernds_cb1 at t-online dot de
                   ` (10 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: rguenther at suse dot de @ 2009-12-02 17:07 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #17 from rguenther at suse dot de  2009-12-02 17:07 -------
Subject: Re:  [4.5 Regression] changes in scheduling
 regress 464.h264ref 20%

On Wed, 2 Dec 2009, bernds_cb1 at t-online dot de wrote:

> ------- Comment #16 from bernds_cb1 at t-online dot de  2009-12-02 17:06 -------
> Thanks for testing.  This means the regression is fixed by the patch?  I'll do
> a full test run then.

Yes, the regression is fixed by the second patch.

Richard.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42216


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug rtl-optimization/42216] [4.5 Regression] changes in scheduling regress 464.h264ref 20%
  2009-11-29 13:39 [Bug tree-optimization/42216] New: [4.5 Regression] 464.h265ref peak regressed 20% rguenth at gcc dot gnu dot org
                   ` (18 preceding siblings ...)
  2009-12-02 17:07 ` rguenther at suse dot de
@ 2009-12-04 14:26 ` bernds_cb1 at t-online dot de
  2009-12-04 14:35 ` rguenther at suse dot de
                   ` (9 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: bernds_cb1 at t-online dot de @ 2009-12-04 14:26 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #18 from bernds_cb1 at t-online dot de  2009-12-04 14:26 -------
Unfortunately it causes failures.  Tracking these mismatches really is quite
tricky.

I'll try to fix it.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42216


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug rtl-optimization/42216] [4.5 Regression] changes in scheduling regress 464.h264ref 20%
  2009-11-29 13:39 [Bug tree-optimization/42216] New: [4.5 Regression] 464.h265ref peak regressed 20% rguenth at gcc dot gnu dot org
                   ` (19 preceding siblings ...)
  2009-12-04 14:26 ` bernds_cb1 at t-online dot de
@ 2009-12-04 14:35 ` rguenther at suse dot de
  2009-12-10 18:03 ` bernds at gcc dot gnu dot org
                   ` (8 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: rguenther at suse dot de @ 2009-12-04 14:35 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #19 from rguenther at suse dot de  2009-12-04 14:34 -------
Subject: Re:  [4.5 Regression] changes in scheduling
 regress 464.h264ref 20%

On Fri, 4 Dec 2009, bernds_cb1 at t-online dot de wrote:

> ------- Comment #18 from bernds_cb1 at t-online dot de  2009-12-04 14:26 -------
> Unfortunately it causes failures.  Tracking these mismatches really is quite
> tricky.
> 
> I'll try to fix it.

Thanks.

Richard.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42216


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug rtl-optimization/42216] [4.5 Regression] changes in scheduling regress 464.h264ref 20%
  2009-11-29 13:39 [Bug tree-optimization/42216] New: [4.5 Regression] 464.h265ref peak regressed 20% rguenth at gcc dot gnu dot org
                   ` (20 preceding siblings ...)
  2009-12-04 14:35 ` rguenther at suse dot de
@ 2009-12-10 18:03 ` bernds at gcc dot gnu dot org
  2009-12-11 21:39 ` rguenth at gcc dot gnu dot org
                   ` (7 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: bernds at gcc dot gnu dot org @ 2009-12-10 18:03 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #20 from bernds at gcc dot gnu dot org  2009-12-10 18:03 -------
Subject: Bug 42216

Author: bernds
Date: Thu Dec 10 18:03:05 2009
New Revision: 155134

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=155134
Log:
        PR rtl-opt/42216
        * regrename.c: Error out if MAX_RECOG_OPERANDS is larger than
        HOST_BITS_PER_WIDE_INT.
        (verify_reg_in_set): New function, broken out of verify_reg_tracked.
        (verify_reg_tracked): Use it.
        (scan_rtx_reg): When seeing a use involving a superset of the
        registers in an existing chain, enlarge that chain.  Otherwise,
        allow subsets and set fail_current_block for all other kinds of
        overlap.
        (hide_operands): New argument UNTRACKED_OPERANDS; callers changed.
        Do not modify operands when the bit with the corresponding number
        is set in that bitmap.
        (build_def_use): When we see matching operands with different
        modes, don't set fail_current_block, but keep track of such
        operands in a bitmap if their registers aren't already tracked
        in an open chain.  Pass this bitmap to all hide_operands calls.


Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/regrename.c


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42216


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug rtl-optimization/42216] [4.5 Regression] changes in scheduling regress 464.h264ref 20%
  2009-11-29 13:39 [Bug tree-optimization/42216] New: [4.5 Regression] 464.h265ref peak regressed 20% rguenth at gcc dot gnu dot org
                   ` (21 preceding siblings ...)
  2009-12-10 18:03 ` bernds at gcc dot gnu dot org
@ 2009-12-11 21:39 ` rguenth at gcc dot gnu dot org
  2010-03-08 11:48 ` rguenth at gcc dot gnu dot org
                   ` (6 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2009-12-11 21:39 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #21 from rguenth at gcc dot gnu dot org  2009-12-11 21:39 -------
All is well again.


-- 

rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42216


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug rtl-optimization/42216] [4.5 Regression] changes in scheduling regress 464.h264ref 20%
  2009-11-29 13:39 [Bug tree-optimization/42216] New: [4.5 Regression] 464.h265ref peak regressed 20% rguenth at gcc dot gnu dot org
                   ` (22 preceding siblings ...)
  2009-12-11 21:39 ` rguenth at gcc dot gnu dot org
@ 2010-03-08 11:48 ` rguenth at gcc dot gnu dot org
  2010-03-08 23:04 ` bernds_cb1 at t-online dot de
                   ` (5 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2010-03-08 11:48 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #22 from rguenth at gcc dot gnu dot org  2010-03-08 11:48 -------
All is bad again after the fix for PR42220:

2010-03-07  Bernd Schmidt  <bernd.schmidt@analog.com>

        PR rtl-optimization/42220
        * regrename.c (scan_rtx) <case STRICT_LOW_PART, ZERO_EXTRACT>:
        Use verify_reg_tracked to determine if we should use OP_OUT rather
        than OP_INOUT.
        (build_def_use): If we see an in-out operand for a register that we
        know nothing about, treat is an output if possible, fail the block if
        not.


-- 

rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |REOPENED
         Resolution|FIXED                       |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42216


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug rtl-optimization/42216] [4.5 Regression] changes in scheduling regress 464.h264ref 20%
  2009-11-29 13:39 [Bug tree-optimization/42216] New: [4.5 Regression] 464.h265ref peak regressed 20% rguenth at gcc dot gnu dot org
                   ` (23 preceding siblings ...)
  2010-03-08 11:48 ` rguenth at gcc dot gnu dot org
@ 2010-03-08 23:04 ` bernds_cb1 at t-online dot de
  2010-03-09  8:35 ` dominiq at lps dot ens dot fr
                   ` (4 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: bernds_cb1 at t-online dot de @ 2010-03-08 23:04 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #23 from bernds_cb1 at t-online dot de  2010-03-08 23:04 -------
Created an attachment (id=20048)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=20048&action=view)
Alternative fix for 42220

If you wouldn't mind, please test the attached patch which should be an
alternative fix for 42220 and avoids the need to set fail_block for some cases.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42216


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug rtl-optimization/42216] [4.5 Regression] changes in scheduling regress 464.h264ref 20%
  2009-11-29 13:39 [Bug tree-optimization/42216] New: [4.5 Regression] 464.h265ref peak regressed 20% rguenth at gcc dot gnu dot org
                   ` (24 preceding siblings ...)
  2010-03-08 23:04 ` bernds_cb1 at t-online dot de
@ 2010-03-09  8:35 ` dominiq at lps dot ens dot fr
  2010-03-09 10:31 ` rguenther at suse dot de
                   ` (3 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: dominiq at lps dot ens dot fr @ 2010-03-09  8:35 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #24 from dominiq at lps dot ens dot fr  2010-03-09 08:35 -------
> If you wouldn't mind, please test the attached patch which should be an
> alternative fix for 42220 and avoids the need to set fail_block for some cases.

I cannot test if the patch in comment #23 fixes the regression for SPEC, but a
partial test indicates that it does not reintroduce pr42220 (patch applied on
top of r157293 with the patch from
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43287#c10 and a simple update).
More testing during the coming nights).


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42216


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug rtl-optimization/42216] [4.5 Regression] changes in scheduling regress 464.h264ref 20%
  2009-11-29 13:39 [Bug tree-optimization/42216] New: [4.5 Regression] 464.h265ref peak regressed 20% rguenth at gcc dot gnu dot org
                   ` (25 preceding siblings ...)
  2010-03-09  8:35 ` dominiq at lps dot ens dot fr
@ 2010-03-09 10:31 ` rguenther at suse dot de
  2010-03-14  9:34 ` dominiq at lps dot ens dot fr
                   ` (2 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: rguenther at suse dot de @ 2010-03-09 10:31 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #25 from rguenther at suse dot de  2010-03-09 10:31 -------
Subject: Re:  [4.5 Regression] changes in scheduling
 regress 464.h264ref 20%

On Mon, 8 Mar 2010, bernds_cb1 at t-online dot de wrote:

> ------- Comment #23 from bernds_cb1 at t-online dot de  2010-03-08 23:04 -------
> Created an attachment (id=20048)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=20048&action=view)
>  --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=20048&action=view)
> Alternative fix for 42220
> 
> If you wouldn't mind, please test the attached patch which should be an
> alternative fix for 42220 and avoids the need to set fail_block for some cases.

The patch fixes the regression for me.

Thanks,
Richard.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42216


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug rtl-optimization/42216] [4.5 Regression] changes in scheduling regress 464.h264ref 20%
  2009-11-29 13:39 [Bug tree-optimization/42216] New: [4.5 Regression] 464.h265ref peak regressed 20% rguenth at gcc dot gnu dot org
                   ` (26 preceding siblings ...)
  2010-03-09 10:31 ` rguenther at suse dot de
@ 2010-03-14  9:34 ` dominiq at lps dot ens dot fr
  2010-03-17  9:26 ` bernds at gcc dot gnu dot org
  2010-03-18 17:59 ` rguenth at gcc dot gnu dot org
  29 siblings, 0 replies; 31+ messages in thread
From: dominiq at lps dot ens dot fr @ 2010-03-14  9:34 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #26 from dominiq at lps dot ens dot fr  2010-03-14 09:33 -------
I have posted a full regression testing with the patch in comment #23 at

http://gcc.gnu.org/ml/gcc-testresults/2010-03/msg01182.html

The failures are the usual ones and pr42220 is still fixed.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42216


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug rtl-optimization/42216] [4.5 Regression] changes in scheduling regress 464.h264ref 20%
  2009-11-29 13:39 [Bug tree-optimization/42216] New: [4.5 Regression] 464.h265ref peak regressed 20% rguenth at gcc dot gnu dot org
                   ` (27 preceding siblings ...)
  2010-03-14  9:34 ` dominiq at lps dot ens dot fr
@ 2010-03-17  9:26 ` bernds at gcc dot gnu dot org
  2010-03-18 17:59 ` rguenth at gcc dot gnu dot org
  29 siblings, 0 replies; 31+ messages in thread
From: bernds at gcc dot gnu dot org @ 2010-03-17  9:26 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #27 from bernds at gcc dot gnu dot org  2010-03-17 09:25 -------
Subject: Bug 42216

Author: bernds
Date: Wed Mar 17 09:25:35 2010
New Revision: 157511

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=157511
Log:
        PR rtl-optimization/42216
        * regrename.c (create_new_chain): New function, broken out from...
        (scan_rtx_reg): ... here.  Call it.  Handle the case where we are
        appending a use to an empty chain.
        (build_def_use): Remove previous changes that convert OP_INOUT to
        OP_OUT operands; instead detect the case where an OP_INOUT operand
        uses a previously untracked register and create an empty chain for
        it.


Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/regrename.c


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42216


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug rtl-optimization/42216] [4.5 Regression] changes in scheduling regress 464.h264ref 20%
  2009-11-29 13:39 [Bug tree-optimization/42216] New: [4.5 Regression] 464.h265ref peak regressed 20% rguenth at gcc dot gnu dot org
                   ` (28 preceding siblings ...)
  2010-03-17  9:26 ` bernds at gcc dot gnu dot org
@ 2010-03-18 17:59 ` rguenth at gcc dot gnu dot org
  29 siblings, 0 replies; 31+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2010-03-18 17:59 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #28 from rguenth at gcc dot gnu dot org  2010-03-18 17:59 -------
All fine again. Fixed.


-- 

rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|REOPENED                    |RESOLVED
         Resolution|                            |FIXED


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42216


^ permalink raw reply	[flat|nested] 31+ messages in thread

end of thread, other threads:[~2010-03-18 17:59 UTC | newest]

Thread overview: 31+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-11-29 13:39 [Bug tree-optimization/42216] New: [4.5 Regression] 464.h265ref peak regressed 20% rguenth at gcc dot gnu dot org
2009-11-29 13:39 ` [Bug tree-optimization/42216] " rguenth at gcc dot gnu dot org
2009-11-29 17:58 ` rth at gcc dot gnu dot org
2009-11-30  9:58 ` [Bug tree-optimization/42216] [4.5 Regression] 464.h264ref " rguenther at suse dot de
2009-11-30 10:54 ` hjl dot tools at gmail dot com
2009-12-01  9:58 ` [Bug tree-optimization/42216] [4.5 Regression] rev 15458[78] regress 464.h264ref peak 20% rguenth at gcc dot gnu dot org
2009-12-01 15:27 ` bernds_cb1 at t-online dot de
2009-12-01 15:36 ` rguenther at suse dot de
2009-12-01 16:46 ` [Bug tree-optimization/42216] [4.5 Regression] rev 154688 " rguenth at gcc dot gnu dot org
2009-12-01 17:03 ` rguenth at gcc dot gnu dot org
2009-12-01 17:26 ` bernds_cb1 at t-online dot de
2009-12-01 17:27 ` bernds_cb1 at t-online dot de
2009-12-02 12:15 ` [Bug rtl-optimization/42216] [4.5 Regression] changes in scheduling regress 464.h264ref 20% rguenth at gcc dot gnu dot org
2009-12-02 13:16 ` rguenth at gcc dot gnu dot org
2009-12-02 13:17 ` rguenth at gcc dot gnu dot org
2009-12-02 16:28 ` bernds_cb1 at t-online dot de
2009-12-02 16:29 ` bernds_cb1 at t-online dot de
2009-12-02 16:57 ` rguenth at gcc dot gnu dot org
2009-12-02 17:06 ` bernds_cb1 at t-online dot de
2009-12-02 17:07 ` rguenther at suse dot de
2009-12-04 14:26 ` bernds_cb1 at t-online dot de
2009-12-04 14:35 ` rguenther at suse dot de
2009-12-10 18:03 ` bernds at gcc dot gnu dot org
2009-12-11 21:39 ` rguenth at gcc dot gnu dot org
2010-03-08 11:48 ` rguenth at gcc dot gnu dot org
2010-03-08 23:04 ` bernds_cb1 at t-online dot de
2010-03-09  8:35 ` dominiq at lps dot ens dot fr
2010-03-09 10:31 ` rguenther at suse dot de
2010-03-14  9:34 ` dominiq at lps dot ens dot fr
2010-03-17  9:26 ` bernds at gcc dot gnu dot org
2010-03-18 17:59 ` rguenth at gcc dot gnu dot org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).