public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug tree-optimization/56151] New: Performance degradation after r194054 on x86 Atom.
@ 2013-01-30 15:18 ysrumyan at gmail dot com
  2013-01-30 15:20 ` [Bug tree-optimization/56151] " ysrumyan at gmail dot com
                   ` (14 more replies)
  0 siblings, 15 replies; 16+ messages in thread
From: ysrumyan at gmail dot com @ 2013-01-30 15:18 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56151

             Bug #: 56151
           Summary: Performance degradation after r194054 on x86 Atom.
    Classification: Unclassified
           Product: gcc
           Version: 4.8.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: ysrumyan@gmail.com


For simple test-case extracted from one important benchmark we see number
instrctions increasing after fix r194054. If we compile test-case with compiler
built before this revision we have only 6 instructions for store stmt:

    movl    %edx, %ebx
    movl    %edx, %ebp
    sarl    $5, %ebx
    andl    $31, %ebp
    movl    setmask(,%ebp,4), %ebp
    orl    %ebp, (%edi,%ebx,4)

but after this fix we got 9 instrctions for it:

    movl    %edx, %ebx
    movl    56(%esp), %esi
    sarl    $5, %ebx
    movl    %edx, %ebp
    andl    $31, %ebp
    leal    (%esi,%ebx,4), %esi
    movl    (%esi), %ebx
    orl    setmask(,%ebp,4), %ebx
    movl    %ebx, (%esi)

To reproduce it is sufficient to compile with the following options:

-O2 -ffast-math -msse2 -mfpmath=sse -m32 -march=atom -mtune=atom


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug tree-optimization/56151] Performance degradation after r194054 on x86 Atom.
  2013-01-30 15:18 [Bug tree-optimization/56151] New: Performance degradation after r194054 on x86 Atom ysrumyan at gmail dot com
@ 2013-01-30 15:20 ` ysrumyan at gmail dot com
  2013-01-30 16:04 ` [Bug rtl-optimization/56151] " rguenth at gcc dot gnu.org
                   ` (13 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: ysrumyan at gmail dot com @ 2013-01-30 15:20 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56151

--- Comment #1 from Yuri Rumyantsev <ysrumyan at gmail dot com> 2013-01-30 15:20:07 UTC ---
Created attachment 29306
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=29306
testcase to reproduce


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug rtl-optimization/56151] Performance degradation after r194054 on x86 Atom.
  2013-01-30 15:18 [Bug tree-optimization/56151] New: Performance degradation after r194054 on x86 Atom ysrumyan at gmail dot com
  2013-01-30 15:20 ` [Bug tree-optimization/56151] " ysrumyan at gmail dot com
@ 2013-01-30 16:04 ` rguenth at gcc dot gnu.org
  2013-01-31  9:24 ` [Bug rtl-optimization/56151] [4.8 Regression] " rguenth at gcc dot gnu.org
                   ` (12 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: rguenth at gcc dot gnu.org @ 2013-01-30 16:04 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56151

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |steven at gcc dot gnu.org
          Component|tree-optimization           |rtl-optimization

--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> 2013-01-30 16:04:03 UTC ---
Steven, that's

 2012-12-02  Steven Bosscher  <steven@gcc.gnu.org>

+       * optabs.c (add_equal_note): Do not create self-referencing REG_EQUAL
+       notes.
+       * fwprop.c (forward_propagate_and_simplify): Likewise.


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug rtl-optimization/56151] [4.8 Regression] Performance degradation after r194054 on x86 Atom.
  2013-01-30 15:18 [Bug tree-optimization/56151] New: Performance degradation after r194054 on x86 Atom ysrumyan at gmail dot com
  2013-01-30 15:20 ` [Bug tree-optimization/56151] " ysrumyan at gmail dot com
  2013-01-30 16:04 ` [Bug rtl-optimization/56151] " rguenth at gcc dot gnu.org
@ 2013-01-31  9:24 ` rguenth at gcc dot gnu.org
  2013-02-03  0:46 ` steven at gcc dot gnu.org
                   ` (11 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: rguenth at gcc dot gnu.org @ 2013-01-31  9:24 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56151

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
             Target|                            |i?86-*-*
   Target Milestone|---                         |4.8.0
            Summary|Performance degradation     |[4.8 Regression]
                   |after r194054 on x86 Atom.  |Performance degradation
                   |                            |after r194054 on x86 Atom.


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug rtl-optimization/56151] [4.8 Regression] Performance degradation after r194054 on x86 Atom.
  2013-01-30 15:18 [Bug tree-optimization/56151] New: Performance degradation after r194054 on x86 Atom ysrumyan at gmail dot com
                   ` (2 preceding siblings ...)
  2013-01-31  9:24 ` [Bug rtl-optimization/56151] [4.8 Regression] " rguenth at gcc dot gnu.org
@ 2013-02-03  0:46 ` steven at gcc dot gnu.org
  2013-02-03  1:05 ` steven at gcc dot gnu.org
                   ` (10 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: steven at gcc dot gnu.org @ 2013-02-03  0:46 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56151

Steven Bosscher <steven at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2013-02-03
                 CC|steven at gcc dot gnu.org   |
     Ever Confirmed|0                           |1

--- Comment #3 from Steven Bosscher <steven at gcc dot gnu.org> 2013-02-03 00:45:37 UTC ---
Confirmed.

My patch only avoids situations that ought not to happen in the first
place: self-referencing REG_EQUAL notes.  If this somehow prevents an
interesting code transformation I don't consider it my responsibility
to find out what that might be. Therefore I'm not taking this bug.

I did look at the difference in generated code. The regression is due
to the optabs.c change, reverting it results in the pre-r194054 code
again. The difference in code from the 'expand' pass is:

    86: NOTE_INSN_BASIC_BLOCK 13
    87: {r130:SI=r60:SI>>0x5;clobber flags:CC;}
    88: {r131:SI=r130:SI<<0x2;clobber flags:CC;}
    89: {r86:SI=r103:SI+r131:SI;clobber flags:CC;}
    90: {r132:SI=r60:SI&0x1f;clobber flags:CC;}
    91: r133:SI=`setmask'
-   92: r134:SI=[r132:SI*0x4+r133:SI]
-   93: {[r86:SI]=[r86:SI]|r134:SI;clobber flags:CC;}
+   94: r136:SI=[r132:SI*0x4+r133:SI]
+   95: r137:SI=[r86:SI]
+   96: {r135:SI=r137:SI|r136:SI;clobber flags:CC;}
+   97: [r86:SI]=r135:SI
       REG_EQUAL [r86:SI]|[r132:SI*0x4+r133:SI]
    // next basic block

The '+' code is the post-patch code, '-' is pre-patch (the continuous
numbering is due to the '-' code being rejected post-past, so those two
insn UIDs are lost).

Post-patch the extra moves of insns 95 and 97 are never cleaned up. The
extra move is introduced to make room for the REG_EQUAL note that is 
self-referencing in the pre-patch situation (SET of r86 with REG_EQUAL
note using r86). So the pre-patch RTL was indeed invalid.

Combine fails to restore the code to its pre-patch state because it
combines:

   94: r136:SI=[r132:SI*0x4+`setmask']
   95: r137:SI=[r86:SI]
   96: {r135:SI=r137:SI|r136:SI;clobber flags:CC;}

to:

   95: r137:SI=[r86:SI]
   96: {r135:SI=r137:SI|[r132:SI*0x4+`setmask'];clobber flags:CC;}

and the 'setmask' reference make merging the SET of r135 and the store
impossible:

Trying 96 -> 97:
Failed to match this instruction:
(parallel [
        (set (mem:SI (reg/f:SI 86 [ D.1803 ]) [2 *_41+0 S4 A32])
            (ior:SI (reg:SI 137 [ *_41 ])
                (mem:SI (plus:SI (mult:SI (reg:SI 132 [ D.1801 ])
                            (const_int 4 [0x4]))
                        (symbol_ref:SI ("setmask") <var_decl 0x7ffff63fb390
setmask>)) [2 setmaskD.1732 S4 A32])))
        (clobber (reg:CC 17 flags))
    ])
Failed to match this instruction:
(set (mem:SI (reg/f:SI 86 [ D.1803 ]) [2 *_41+0 S4 A32])
    (ior:SI (reg:SI 137 [ *_41 ])
        (mem:SI (plus:SI (mult:SI (reg:SI 132 [ D.1801 ])
                    (const_int 4 [0x4]))
                (symbol_ref:SI ("setmask") <var_decl 0x7ffff63fb390 setmask>))
[2 setmaskD.1732 S4 A32])))


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug rtl-optimization/56151] [4.8 Regression] Performance degradation after r194054 on x86 Atom.
  2013-01-30 15:18 [Bug tree-optimization/56151] New: Performance degradation after r194054 on x86 Atom ysrumyan at gmail dot com
                   ` (3 preceding siblings ...)
  2013-02-03  0:46 ` steven at gcc dot gnu.org
@ 2013-02-03  1:05 ` steven at gcc dot gnu.org
  2013-02-04 10:34 ` ebotcazou at gcc dot gnu.org
                   ` (9 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: steven at gcc dot gnu.org @ 2013-02-03  1:05 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56151

Steven Bosscher <steven at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |steven at gcc dot gnu.org

--- Comment #4 from Steven Bosscher <steven at gcc dot gnu.org> 2013-02-03 01:05:34 UTC ---
Patch to help explain the problem:

Index: optabs.c
===================================================================
--- optabs.c    (revision 195687)
+++ optabs.c    (working copy)
@@ -1452,8 +1452,13 @@ expand_binop_directly (enum machine_mode
     {
       /* If PAT is composed of more than one insn, try to add an appropriate
         REG_EQUAL note to it.  If we can't because TEMP conflicts with an
-        operand, call expand_binop again, this time without a target.  */
+        operand, call expand_binop again, this time without a target.
+
+        However, if target is a MEM, just accept the lossage of not having
+        a REG_EQUAL note.  This avoids splitting up insns of the form
+        "MEM=MEM op X", a form we may not be able to reconstruct later.  */
       if (INSN_P (pat) && NEXT_INSN (pat) != NULL_RTX
+         && ! (target && MEM_P (target))
          && ! add_equal_note (pat, ops[0].value, optab_to_code (binoptab),
                               ops[1].value, ops[2].value))
        {


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug rtl-optimization/56151] [4.8 Regression] Performance degradation after r194054 on x86 Atom.
  2013-01-30 15:18 [Bug tree-optimization/56151] New: Performance degradation after r194054 on x86 Atom ysrumyan at gmail dot com
                   ` (4 preceding siblings ...)
  2013-02-03  1:05 ` steven at gcc dot gnu.org
@ 2013-02-04 10:34 ` ebotcazou at gcc dot gnu.org
  2013-02-04 12:04 ` jakub at gcc dot gnu.org
                   ` (8 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: ebotcazou at gcc dot gnu.org @ 2013-02-04 10:34 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56151

Eric Botcazou <ebotcazou at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ebotcazou at gcc dot
                   |                            |gnu.org

--- Comment #5 from Eric Botcazou <ebotcazou at gcc dot gnu.org> 2013-02-04 10:34:25 UTC ---
> Patch to help explain the problem:
> 
> Index: optabs.c
> ===================================================================
> --- optabs.c    (revision 195687)
> +++ optabs.c    (working copy)
> @@ -1452,8 +1452,13 @@ expand_binop_directly (enum machine_mode
>      {
>        /* If PAT is composed of more than one insn, try to add an appropriate
>          REG_EQUAL note to it.  If we can't because TEMP conflicts with an
> -        operand, call expand_binop again, this time without a target.  */
> +        operand, call expand_binop again, this time without a target.
> +
> +        However, if target is a MEM, just accept the lossage of not having
> +        a REG_EQUAL note.  This avoids splitting up insns of the form
> +        "MEM=MEM op X", a form we may not be able to reconstruct later.  */
>        if (INSN_P (pat) && NEXT_INSN (pat) != NULL_RTX
> +         && ! (target && MEM_P (target))
>           && ! add_equal_note (pat, ops[0].value, optab_to_code (binoptab),
>                                ops[1].value, ops[2].value))
>         {

I agree that we should try to do something like that.  Generating inferior code
to be able to add a REG_EQUAL that doesn't help in the end isn't very clever.
There is also the same pattern in expand_unop_direct.

Is that the "maximal" test that can be added to catch this case or could it be
made more precise?


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug rtl-optimization/56151] [4.8 Regression] Performance degradation after r194054 on x86 Atom.
  2013-01-30 15:18 [Bug tree-optimization/56151] New: Performance degradation after r194054 on x86 Atom ysrumyan at gmail dot com
                   ` (5 preceding siblings ...)
  2013-02-04 10:34 ` ebotcazou at gcc dot gnu.org
@ 2013-02-04 12:04 ` jakub at gcc dot gnu.org
  2013-02-04 17:14 ` jakub at gcc dot gnu.org
                   ` (7 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: jakub at gcc dot gnu.org @ 2013-02-04 12:04 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56151

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org,
                   |                            |uros at gcc dot gnu.org

--- Comment #6 from Jakub Jelinek <jakub at gcc dot gnu.org> 2013-02-04 12:03:52 UTC ---
Well, we already have peephole2s for MEM op= ARG (the TARGET_READ_MODIFY_WRITE
guarded peephole2s), the reason why it doesn't trigger in this case is that ARG
in this case ends up being a MEM, and
MEM op= MEM2 isn't a valid insn:

(insn 95 94 96 11 (set (reg:SI 3 bx [orig:135 D.1801 ] [135])
        (mem:SI (reg/f:SI 4 si [orig:86 D.1803 ] [86]) [2 *_41+0 S4 A32]))
pr56151.c:32 89 {*movsi_internal}
     (nil))
(insn 96 95 97 11 (parallel [
            (set (reg:SI 3 bx [orig:135 D.1801 ] [135])
                (ior:SI (reg:SI 3 bx [orig:135 D.1801 ] [135])
                    (mem:SI (plus:SI (mult:SI (reg:SI 6 bp [orig:132 D.1801 ]
[132])
                                (const_int 4 [0x4]))
                            (symbol_ref:SI ("setmask")  <var_decl
0x7f587927a390 setmask>)) [2 setmask S4 A32])))
            (clobber (reg:CC 17 flags))
        ]) pr56151.c:32 421 {*iorsi_1}
     (expr_list:REG_DEAD (reg:SI 6 bp [orig:132 D.1801 ] [132])
        (nil)))
(insn 97 96 98 11 (set (mem:SI (reg/f:SI 4 si [orig:86 D.1803 ] [86]) [2 *_41+0
S4 A32])
        (reg:SI 3 bx [orig:135 D.1801 ] [135])) pr56151.c:32 89
{*movsi_internal}
     (expr_list:REG_DEAD (reg:SI 3 bx [orig:135 D.1801 ] [135])
        (expr_list:REG_DEAD (reg/f:SI 4 si [orig:86 D.1803 ] [86])
            (nil))))

I guess for temp = MEM; temp op= MEM2; MEM = temp; for temp dead afterwards
(this case), we could as well just add another set of the (up to 3?)
peephole2's, that would transform that into
temp = MEM2;
MEM op= temp;
instead.


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug rtl-optimization/56151] [4.8 Regression] Performance degradation after r194054 on x86 Atom.
  2013-01-30 15:18 [Bug tree-optimization/56151] New: Performance degradation after r194054 on x86 Atom ysrumyan at gmail dot com
                   ` (6 preceding siblings ...)
  2013-02-04 12:04 ` jakub at gcc dot gnu.org
@ 2013-02-04 17:14 ` jakub at gcc dot gnu.org
  2013-02-04 17:59 ` steven at gcc dot gnu.org
                   ` (6 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: jakub at gcc dot gnu.org @ 2013-02-04 17:14 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56151

--- Comment #7 from Jakub Jelinek <jakub at gcc dot gnu.org> 2013-02-04 17:13:52 UTC ---
Created attachment 29350
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=29350
gcc48-pr56151.patch

Untested patch for the peephole mentioned in previous comment.


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug rtl-optimization/56151] [4.8 Regression] Performance degradation after r194054 on x86 Atom.
  2013-01-30 15:18 [Bug tree-optimization/56151] New: Performance degradation after r194054 on x86 Atom ysrumyan at gmail dot com
                   ` (7 preceding siblings ...)
  2013-02-04 17:14 ` jakub at gcc dot gnu.org
@ 2013-02-04 17:59 ` steven at gcc dot gnu.org
  2013-02-04 18:20 ` jakub at gcc dot gnu.org
                   ` (5 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: steven at gcc dot gnu.org @ 2013-02-04 17:59 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56151

--- Comment #8 from Steven Bosscher <steven at gcc dot gnu.org> 2013-02-04 17:59:22 UTC ---
(In reply to comment #7)
> Created attachment 29350 [details]
> gcc48-pr56151.patch
> 
> Untested patch for the peephole mentioned in previous comment.

I don't think a new set of peepholes is an appropriate solution for
this problem, because the same issue can show up on any CISC-like
target. Before my patch, the code was good after 'expand' by optabs,
perhaps the code your peephole tries to create, can be emitted from
there instead. And perhaps, longer term, combine should be changed
to try the more profitable combination before the one it performs now.


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug rtl-optimization/56151] [4.8 Regression] Performance degradation after r194054 on x86 Atom.
  2013-01-30 15:18 [Bug tree-optimization/56151] New: Performance degradation after r194054 on x86 Atom ysrumyan at gmail dot com
                   ` (8 preceding siblings ...)
  2013-02-04 17:59 ` steven at gcc dot gnu.org
@ 2013-02-04 18:20 ` jakub at gcc dot gnu.org
  2013-02-08 15:23 ` rguenth at gcc dot gnu.org
                   ` (4 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: jakub at gcc dot gnu.org @ 2013-02-04 18:20 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56151

--- Comment #9 from Jakub Jelinek <jakub at gcc dot gnu.org> 2013-02-04 18:20:09 UTC ---
(In reply to comment #8)
> (In reply to comment #7)
> > Created attachment 29350 [details]
> > gcc48-pr56151.patch
> > 
> > Untested patch for the peephole mentioned in previous comment.
> 
> I don't think a new set of peepholes is an appropriate solution for
> this problem, because the same issue can show up on any CISC-like
> target. Before my patch, the code was good after 'expand' by optabs,
> perhaps the code your peephole tries to create, can be emitted from
> there instead. And perhaps, longer term, combine should be changed
> to try the more profitable combination before the one it performs now.

My patch wasn't meant as a replacement of the optabs.c change, but as a
complement.  As the sequence into which it attempts to peephole is shorter and
supposedly faster on some CPUs at least, if such an insn sequence appears in
the IL anywhere during RTL optimizations, it can optimize it.


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug rtl-optimization/56151] [4.8 Regression] Performance degradation after r194054 on x86 Atom.
  2013-01-30 15:18 [Bug tree-optimization/56151] New: Performance degradation after r194054 on x86 Atom ysrumyan at gmail dot com
                   ` (9 preceding siblings ...)
  2013-02-04 18:20 ` jakub at gcc dot gnu.org
@ 2013-02-08 15:23 ` rguenth at gcc dot gnu.org
  2013-02-08 15:29 ` jakub at gcc dot gnu.org
                   ` (3 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: rguenth at gcc dot gnu.org @ 2013-02-08 15:23 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56151

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P1


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug rtl-optimization/56151] [4.8 Regression] Performance degradation after r194054 on x86 Atom.
  2013-01-30 15:18 [Bug tree-optimization/56151] New: Performance degradation after r194054 on x86 Atom ysrumyan at gmail dot com
                   ` (10 preceding siblings ...)
  2013-02-08 15:23 ` rguenth at gcc dot gnu.org
@ 2013-02-08 15:29 ` jakub at gcc dot gnu.org
  2013-02-11 15:42 ` jakub at gcc dot gnu.org
                   ` (2 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: jakub at gcc dot gnu.org @ 2013-02-08 15:29 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56151

--- Comment #10 from Jakub Jelinek <jakub at gcc dot gnu.org> 2013-02-08 15:29:12 UTC ---
(In reply to comment #4)
> Patch to help explain the problem:
> 
> Index: optabs.c
> ===================================================================
> --- optabs.c    (revision 195687)
> +++ optabs.c    (working copy)
> @@ -1452,8 +1452,13 @@ expand_binop_directly (enum machine_mode
>      {
>        /* If PAT is composed of more than one insn, try to add an appropriate
>          REG_EQUAL note to it.  If we can't because TEMP conflicts with an
> -        operand, call expand_binop again, this time without a target.  */
> +        operand, call expand_binop again, this time without a target.
> +
> +        However, if target is a MEM, just accept the lossage of not having
> +        a REG_EQUAL note.  This avoids splitting up insns of the form
> +        "MEM=MEM op X", a form we may not be able to reconstruct later.  */
>        if (INSN_P (pat) && NEXT_INSN (pat) != NULL_RTX
> +         && ! (target && MEM_P (target))
>           && ! add_equal_note (pat, ops[0].value, optab_to_code (binoptab),
>                                ops[1].value, ops[2].value))
>         {

Shouldn't the && ! (target && MEM_P (target))
come after add_equal_note call, not before it?  I.e. shouldn't we try to add
the note if possible?


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug rtl-optimization/56151] [4.8 Regression] Performance degradation after r194054 on x86 Atom.
  2013-01-30 15:18 [Bug tree-optimization/56151] New: Performance degradation after r194054 on x86 Atom ysrumyan at gmail dot com
                   ` (11 preceding siblings ...)
  2013-02-08 15:29 ` jakub at gcc dot gnu.org
@ 2013-02-11 15:42 ` jakub at gcc dot gnu.org
  2013-02-12 10:38 ` jakub at gcc dot gnu.org
  2013-02-12 10:39 ` jakub at gcc dot gnu.org
  14 siblings, 0 replies; 16+ messages in thread
From: jakub at gcc dot gnu.org @ 2013-02-11 15:42 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56151

--- Comment #11 from Jakub Jelinek <jakub at gcc dot gnu.org> 2013-02-11 15:42:12 UTC ---
Created attachment 29419
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=29419
gcc48-pr56151.patch

Untested patch, done in add_equal_note, so that the code doesn't have to be
duplicated between expand_binop_directly and expand_unop_directly.


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug rtl-optimization/56151] [4.8 Regression] Performance degradation after r194054 on x86 Atom.
  2013-01-30 15:18 [Bug tree-optimization/56151] New: Performance degradation after r194054 on x86 Atom ysrumyan at gmail dot com
                   ` (12 preceding siblings ...)
  2013-02-11 15:42 ` jakub at gcc dot gnu.org
@ 2013-02-12 10:38 ` jakub at gcc dot gnu.org
  2013-02-12 10:39 ` jakub at gcc dot gnu.org
  14 siblings, 0 replies; 16+ messages in thread
From: jakub at gcc dot gnu.org @ 2013-02-12 10:38 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56151

--- Comment #12 from Jakub Jelinek <jakub at gcc dot gnu.org> 2013-02-12 10:37:42 UTC ---
Author: jakub
Date: Tue Feb 12 10:37:38 2013
New Revision: 195972

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=195972
Log:
    PR rtl-optimization/56151
    * optabs.c (add_equal_note): Don't return 0 if target is a MEM,
    equal to op0 or op1, and last_insn pattern is CODE operation
    with MEM dest and one of the operands matches that MEM.

    * gcc.target/i386/pr56151.c: New test.

Added:
    trunk/gcc/testsuite/gcc.target/i386/pr56151.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/optabs.c
    trunk/gcc/testsuite/ChangeLog


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug rtl-optimization/56151] [4.8 Regression] Performance degradation after r194054 on x86 Atom.
  2013-01-30 15:18 [Bug tree-optimization/56151] New: Performance degradation after r194054 on x86 Atom ysrumyan at gmail dot com
                   ` (13 preceding siblings ...)
  2013-02-12 10:38 ` jakub at gcc dot gnu.org
@ 2013-02-12 10:39 ` jakub at gcc dot gnu.org
  14 siblings, 0 replies; 16+ messages in thread
From: jakub at gcc dot gnu.org @ 2013-02-12 10:39 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56151

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED

--- Comment #13 from Jakub Jelinek <jakub at gcc dot gnu.org> 2013-02-12 10:39:17 UTC ---
Fixed.


^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2013-02-12 10:39 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-01-30 15:18 [Bug tree-optimization/56151] New: Performance degradation after r194054 on x86 Atom ysrumyan at gmail dot com
2013-01-30 15:20 ` [Bug tree-optimization/56151] " ysrumyan at gmail dot com
2013-01-30 16:04 ` [Bug rtl-optimization/56151] " rguenth at gcc dot gnu.org
2013-01-31  9:24 ` [Bug rtl-optimization/56151] [4.8 Regression] " rguenth at gcc dot gnu.org
2013-02-03  0:46 ` steven at gcc dot gnu.org
2013-02-03  1:05 ` steven at gcc dot gnu.org
2013-02-04 10:34 ` ebotcazou at gcc dot gnu.org
2013-02-04 12:04 ` jakub at gcc dot gnu.org
2013-02-04 17:14 ` jakub at gcc dot gnu.org
2013-02-04 17:59 ` steven at gcc dot gnu.org
2013-02-04 18:20 ` jakub at gcc dot gnu.org
2013-02-08 15:23 ` rguenth at gcc dot gnu.org
2013-02-08 15:29 ` jakub at gcc dot gnu.org
2013-02-11 15:42 ` jakub at gcc dot gnu.org
2013-02-12 10:38 ` jakub at gcc dot gnu.org
2013-02-12 10:39 ` jakub at gcc dot gnu.org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).