public inbox for gcc-bugs@sourceware.org
* [Bug rtl-optimization/65135] New: Performance regression  in pic mode after r220674.
@ 2015-02-20  9:53 ysrumyan at gmail dot com
  2015-02-20  9:56 ` [Bug rtl-optimization/65135] " ysrumyan at gmail dot com
                   ` (8 more replies)
  0 siblings, 9 replies; 10+ messages in thread
From: ysrumyan at gmail dot com @ 2015-02-20  9:53 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65135

            Bug ID: 65135
           Summary: Performance regression  in pic mode after r220674.
           Product: gcc
           Version: 5.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: rtl-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: ysrumyan at gmail dot com

We noticed a 10% regression on an important benchmark used for testing x86
32-bit platforms. The regression can be reproduced with the attached test-case:
one more fill is present in the innermost loop after r220674. One possible
solution is to spill live physical registers that are not referenced in
(innermost) loops (the GOT register in our test-case) in the loop preheader
and then restore them in the loop post-header. Note that the regression can be
seen only in pic mode on the 32-bit x86 platform.
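
The proposal above amounts to a small set computation: find the hard
registers that are live across a loop but never referenced inside it. The
following is only an illustrative sketch and not GCC code; the register
numbering, the bitmask representation and the name
regs_to_spill_around_loop are assumptions made for this example.

```c
#include <stdint.h>

/* Hypothetical numbering of the eight 32-bit x86 integer registers;
   for illustration only, this is not GCC's internal numbering.  */
enum { AX, CX, DX, BX, SP, BP, SI, DI };

/* Return the set of hard registers (one bit per register) that are
   live across the loop but never referenced inside it.  These are the
   candidates the report proposes to store in the loop preheader and
   reload after the loop exit, so that the loop body may use them.  */
static uint32_t
regs_to_spill_around_loop (uint32_t live_through_loop,
                           uint32_t referenced_in_loop)
{
  return live_through_loop & ~referenced_in_loop;
}
```

For instance, if a register holding the GOT pointer and %esi are both live
across the loop and only %esi is referenced in the body, the result contains
just the GOT register.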



* [Bug rtl-optimization/65135] Performance regression  in pic mode after r220674.
From: ysrumyan at gmail dot com @ 2015-02-20  9:56 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65135

--- Comment #1 from Yuri Rumyantsev <ysrumyan at gmail dot com> ---
Created attachment 34814
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=34814&action=edit
test-case to reproduce

It needs to be compiled with the -O2 -m32 -fPIE -pie options.



* [Bug rtl-optimization/65135] [5 Regression] Performance regression  in pic mode after r220674.
From: rguenth at gcc dot gnu.org @ 2015-02-20 12:43 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65135

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
                 CC|                            |rth at gcc dot gnu.org
   Target Milestone|---                         |5.0
            Summary|Performance regression  in  |[5 Regression] Performance
                   |pic mode after r220674.     |regression  in pic mode
                   |                            |after r220674.



* [Bug rtl-optimization/65135] [5 Regression] Performance regression  in pic mode after r220674.
From: hjl.tools at gmail dot com @ 2015-02-20 13:04 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65135

--- Comment #2 from H.J. Lu <hjl.tools at gmail dot com> ---
The assembly code generated by r220674 is much shorter:

bar:
    call    __x86.get_pc_thunk.ax
    addl    $_GLOBAL_OFFSET_TABLE_, %eax
    movl    FPtr@GOTOFF(%eax), %edx
    movl    inc@GOTOFF(%eax), %ecx
    leal    (%edx,%ecx,4), %ecx
    cmpl    %ecx, FEOF@GOTOFF(%eax)
    movl    4(%esp), %ecx
    cmovb    F@GOTOFF(%eax), %edx
    movl    %ecx, (%edx)
    movl    inc@GOTOFF(%eax), %ecx
    leal    (%edx,%ecx,4), %edx
    movl    %edx, FPtr@GOTOFF(%eax)
    ret

vs

bar:
    call    __x86.get_pc_thunk.dx
    addl    $_GLOBAL_OFFSET_TABLE_, %edx
    pushl    %edi
    pushl    %esi
    movl    FPtr@GOT(%edx), %ecx
    pushl    %ebx
    movl    inc@GOT(%edx), %ebx
    movl    FEOF@GOT(%edx), %edi
    movl    (%ecx), %eax
    movl    (%ebx), %esi
    leal    (%eax,%esi,4), %esi
    cmpl    %esi, (%edi)
    jnb    .L2
    movl    F@GOT(%edx), %eax
    movl    (%eax), %eax
.L2:
    movl    16(%esp), %edx
    movl    %edx, (%eax)
    movl    (%ebx), %edx
    leal    (%eax,%edx,4), %eax
    movl    %eax, (%ecx)
    popl    %ebx
    popl    %esi
    popl    %edi
    ret
    .size    bar, .-bar

Why doesn't it improve performance? Why does it hurt performance instead?
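
For readers following the two listings, here is a C function that would be
consistent with them. It is a reconstruction from the assembly only: the
actual source is in the attached test-case and may differ, and the element
type (int), the initial values and the buffer buf are assumptions.

```c
/* Hypothetical globals matching the symbols in the listings; the
   4-byte element type is inferred from the scale factor in leal.  */
int buf[8];
int *F = buf;          /* start of the region */
int *FEOF = buf + 4;   /* assumed: one past the last writable slot */
int *FPtr = buf;       /* current write position */
int inc = 1;           /* stride, in elements */

/* Store VALUE at the current position, wrapping back to F when
   FPtr + inc would pass FEOF, then advance by INC elements.  */
void
bar (int value)
{
  int *p = FPtr;
  if (FPtr + inc > FEOF)   /* the cmpl + cmovb (or cmpl + jnb) pair */
    p = F;
  *p = value;
  FPtr = p + inc;          /* inc is read a second time after the store */
}
```

With these assumed initial values, five consecutive calls write slots 0 to 3
and then wrap around to slot 0 again.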



* [Bug rtl-optimization/65135] [5 Regression] Performance regression  in pic mode after r220674.
From: hjl.tools at gmail dot com @ 2015-02-20 13:07 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65135

H.J. Lu <hjl.tools at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |WAITING
   Last reconfirmed|                            |2015-02-20
     Ever confirmed|0                           |1

--- Comment #3 from H.J. Lu <hjl.tools at gmail dot com> ---
(In reply to Yuri Rumyantsev from comment #1)
> Created attachment 34814 [details]
> test-case to reproduce
> 
> Need to compile with -O2 -m32 -fPIE -pie options.

Please provide your assembly code.



* [Bug rtl-optimization/65135] [5 Regression] Performance regression  in pic mode after r220674.
From: hjl.tools at gmail dot com @ 2015-02-20 15:39 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65135

H.J. Lu <hjl.tools at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|WAITING                     |NEW
                 CC|                            |vmakarov at redhat dot com

--- Comment #5 from H.J. Lu <hjl.tools at gmail dot com> ---
After r220674, we have

        movl    z1.1821@GOTOFF(%ebp), %eax
...
        movl    %eax, %esi
...
    movl    %esi, 4(%esp)
...
.L6:
    movl    4(%esp), %ebx
    testl    %ebx, %ebx
    jne    .L9
    movl    20(%esp), %eax
    movl    (%eax,%edx), %eax
    cmpl    $-1, %eax
    je    .L11
.L42:
    movl    8(%esp), %edi
    leal    0(,%eax,4), %edx
    movl    %ecx, %ebx
    addl    %edx, %edi
    cmpl    $101, %ecx
    je    .L40
    movl    12(%esp), %esi
    cmpl    (%edi), %esi
    leal    1(%ebx), %ecx
    jne    .L6

vs

        movl    z1.1821@GOTOFF(%ebx), %esi
...
.L8:
    testl    %esi, %esi
    jne    .L11
    movl    16(%esp), %eax
    movl    (%eax,%edx), %eax
    cmpl    $-1, %eax
    je    .L13
.L43:
    movl    4(%esp), %edi
    leal    0(,%eax,4), %edx
    movl    %ecx, %ebx
    addl    %edx, %edi
    cmpl    $101, %ecx
    je    .L41
    movl    8(%esp), %ebp
    cmpl    (%edi), %ebp
    leal    1(%ebx), %ecx
    jne    .L8

RA puts GOT into %ebp and won't use it for this loop.



* [Bug rtl-optimization/65135] [5 Regression] Performance regression  in pic mode after r220674.
From: law at redhat dot com @ 2015-03-03 16:00 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65135

Jeffrey A. Law <law at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P2
                 CC|                            |law at redhat dot com



* [Bug rtl-optimization/65135] [5 Regression] Performance regression  in pic mode after r220674.
From: vmakarov at gcc dot gnu.org @ 2015-03-06 19:43 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65135

Vladimir Makarov <vmakarov at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |vmakarov at gcc dot gnu.org

--- Comment #6 from Vladimir Makarov <vmakarov at gcc dot gnu.org> ---
There is nothing that can be done here.  RA is all about heuristics, not
about optimal solutions.

In this case we have

        movl    %esi, 4(%esp)   # 251   *movsi_internal/2       [length = 4]
        movl    %eax, 20(%esp)  # 276   *movsi_internal/2       [length = 4]
        movl    R@GOTOFF(%ebp), %eax    # 51    *movsi_internal/1       [length = 6]
        movl    %eax, 24(%esp)  # 277   *movsi_internal/2       [length = 4]
        .p2align 4,,10
        .p2align 3
.L6:
        movl    4(%esp), %ebx   # 367   *movsi_internal/1       [length = 4]
        testl   %ebx, %ebx      # 368   *cmpsi_ccno_1/1 [length = 2]
        jne     .L9     # 75    *jcc_1  [length = 6]
        movl    20(%esp), %eax  # 280   *movsi_internal/1       [length = 4]
        movl    (%eax,%edx), %eax       # 77    *movsi_internal/1       [length = 3]
        cmpl    $-1, %eax       # 85    *cmpsi_1/1      [length = 3]
        je      .L11    # 86    *jcc_1  [length = 6]
.L42:
        movl    8(%esp), %edi   # 282   *movsi_internal/1       [length = 4]
        leal    0(,%eax,4), %edx        # 317   *leasi  [length = 7]
        movl    %ecx, %ebx      # 90    *movsi_internal/1       [length = 2]
        addl    %edx, %edi      # 89    *addsi_1/1      [length = 2]
        cmpl    $101, %ecx      # 92    *cmpsi_1/1      [length = 3]
        je      .L40    # 93    *jcc_1  [length = 6]
        movl    12(%esp), %esi  # 278   *movsi_internal/1       [length = 4]
        cmpl    (%edi), %esi    # 56    *cmpsi_1/2      [length = 2]
        leal    1(%ebx), %ecx   # 318   *leasi  [length = 3]
        jne     .L6     # 57    *jcc_1  [length = 2]


Insn #251 is created by IRA for live range splitting of p126 around the
loop.  The live range in the loop uses p154.  So after IRA we have
p126 and p154 assigned to SI.  Then, in the live range of p154, LRA
generates a reload for insn 56:

      Creating newreg=171 from oldreg=92, assigning class GENERAL_REGS to r171
   56: flags:CCZ=cmp(r171:SI,[r158:SI])
    Inserting insn reload before:
  278: r171:SI=r92:SI

and tries to assign a hard reg to p171:

    Assigning to 171 (cl=GENERAL_REGS, orig=92, freq=1314, tfirst=171, tfreq=1314)...
         Trying 0: spill 157(freq=2604)  Now best 0(cost=1290)

         Trying 1: spill 155(freq=2606)
         Trying 2: spill 153(freq=2156)  Now best 2(cost=842)

         Trying 3: spill 156(freq=2320)
         Trying 4: spill 154(freq=1584)  Now best 4(cost=270)

         Trying 5: spill 158(freq=1381)
      Spill r154(hr=4, freq=1584) for r171

So LRA chooses to spill p154 as the cheapest one.  Skipping some attempted
transformations (such as optional reloads and inheritance), we have at the
end of LRA:

   74: flags:CCZ=cmp([sp:SI+0x4],0)

which is transformed in peephole2 pass into:

  367: bx:SI=[sp:SI+0x4]
  368: flags:CCZ=cmp(bx:SI,0)
   75: pc={(flags:CCZ!=0)?L80:pc}

We cannot reuse si for insn 368 as it is also used in insn #56 (this
is the reload I mentioned above).  So we cannot just change [sp:SI+0x4]
to SI because its value can be corrupted after the 1st loop iteration.

A change in an earlier part of the compiler resulted in different code
before RA, and so we have different code after RA.

So this bug will not be fixed, at least not by changes in RA.
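
The "Trying N: spill ..." lines in the dump above can be read as a small
cost comparison. The sketch below is not the LRA implementation; it is a
simplified model, and its ranking rule is an assumption chosen because it
reproduces the "Now best" lines: each logged cost equals the frequency of
the evicted pseudo minus the frequency of the reload pseudo (1314), and a
candidate whose evicted pseudo is an operand of the reloaded insn (r158 in
insn 56) is ranked behind the others. The struct, the function choose_spill
and the flag used_in_insn are hypothetical names.

```c
#include <stddef.h>

/* One "Trying N: spill P(freq=F)" line from the LRA dump.  */
struct spill_candidate
{
  int hard_regno;     /* hard register being tried */
  int spilled_pseudo; /* pseudo that would be evicted from it */
  int freq;           /* frequency of the evicted pseudo */
  int used_in_insn;   /* assumption: 1 if the evicted pseudo is an
                         operand of the insn being reloaded, else 0 */
};

/* Return the index of the preferred candidate, or -1 if N is 0.
   Assumed ranking: first avoid evicting operands of the reloaded
   insn, then minimise cost = freq of evicted pseudo - RELOAD_FREQ.  */
static int
choose_spill (const struct spill_candidate *c, size_t n, int reload_freq,
              int *best_cost)
{
  int best = -1;
  for (size_t i = 0; i < n; i++)
    {
      int cost = c[i].freq - reload_freq;
      if (best < 0
          || c[i].used_in_insn < c[best].used_in_insn
          || (c[i].used_in_insn == c[best].used_in_insn
              && cost < *best_cost))
        {
          best = (int) i;
          *best_cost = cost;
        }
    }
  return best;
}

/* The six candidates from the dump; r158 is the address operand of
   insn 56, so it is flagged as used in the insn.  */
static const struct spill_candidate dump_candidates[] = {
  { 0, 157, 2604, 0 }, { 1, 155, 2606, 0 }, { 2, 153, 2156, 0 },
  { 3, 156, 2320, 0 }, { 4, 154, 1584, 0 }, { 5, 158, 1381, 1 },
};
```

Calling choose_spill on dump_candidates with a reload frequency of 1314
selects hard reg 4 (evicting r154), in line with the "Spill r154" line of
the dump.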



* [Bug rtl-optimization/65135] [5/6 Regression] Performance regression  in pic mode after r220674.
From: jakub at gcc dot gnu.org @ 2015-04-22 11:59 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65135

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|5.0                         |5.2

--- Comment #7 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
GCC 5.1 has been released.



* [Bug rtl-optimization/65135] [5/6 Regression] Performance regression  in pic mode after r220674.
From: rguenth at gcc dot gnu.org @ 2015-07-16  9:16 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65135

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|5.2                         |5.3

--- Comment #8 from Richard Biener <rguenth at gcc dot gnu.org> ---
GCC 5.2 is being released, adjusting target milestone to 5.3.



Thread overview: 10+ messages
2015-02-20  9:53 [Bug rtl-optimization/65135] New: Performance regression in pic mode after r220674 ysrumyan at gmail dot com
2015-02-20  9:56 ` [Bug rtl-optimization/65135] " ysrumyan at gmail dot com
2015-02-20 12:43 ` [Bug rtl-optimization/65135] [5 Regression] " rguenth at gcc dot gnu.org
2015-02-20 13:04 ` hjl.tools at gmail dot com
2015-02-20 13:07 ` hjl.tools at gmail dot com
2015-02-20 15:39 ` hjl.tools at gmail dot com
2015-03-03 16:00 ` law at redhat dot com
2015-03-06 19:43 ` vmakarov at gcc dot gnu.org
2015-04-22 11:59 ` [Bug rtl-optimization/65135] [5/6 " jakub at gcc dot gnu.org
2015-07-16  9:16 ` rguenth at gcc dot gnu.org
