* [Bug rtl-optimization/65135] Performance regression in pic mode after r220674.
2015-02-20 9:53 [Bug rtl-optimization/65135] New: Performance regression in pic mode after r220674 ysrumyan at gmail dot com
@ 2015-02-20 9:56 ` ysrumyan at gmail dot com
2015-02-20 12:43 ` [Bug rtl-optimization/65135] [5 Regression] " rguenth at gcc dot gnu.org
` (7 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: ysrumyan at gmail dot com @ 2015-02-20 9:56 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65135
--- Comment #1 from Yuri Rumyantsev <ysrumyan at gmail dot com> ---
Created attachment 34814
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=34814&action=edit
test-case to reproduce
Need to compile with -O2 -m32 -fPIE -pie options.
* [Bug rtl-optimization/65135] [5 Regression] Performance regression in pic mode after r220674.
From: rguenth at gcc dot gnu.org @ 2015-02-20 12:43 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65135
Richard Biener <rguenth at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
                 CC|                            |rth at gcc dot gnu.org
   Target Milestone|---                         |5.0
            Summary|Performance regression in   |[5 Regression] Performance
                   |pic mode after r220674.     |regression in pic mode
                   |                            |after r220674.
* [Bug rtl-optimization/65135] [5 Regression] Performance regression in pic mode after r220674.
From: hjl.tools at gmail dot com @ 2015-02-20 13:04 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65135
--- Comment #2 from H.J. Lu <hjl.tools at gmail dot com> ---
The assembly code generated by r220674 is much shorter:
bar:
call __x86.get_pc_thunk.ax
addl $_GLOBAL_OFFSET_TABLE_, %eax
movl FPtr@GOTOFF(%eax), %edx
movl inc@GOTOFF(%eax), %ecx
leal (%edx,%ecx,4), %ecx
cmpl %ecx, FEOF@GOTOFF(%eax)
movl 4(%esp), %ecx
cmovb F@GOTOFF(%eax), %edx
movl %ecx, (%edx)
movl inc@GOTOFF(%eax), %ecx
leal (%edx,%ecx,4), %edx
movl %edx, FPtr@GOTOFF(%eax)
ret
vs
bar:
call __x86.get_pc_thunk.dx
addl $_GLOBAL_OFFSET_TABLE_, %edx
pushl %edi
pushl %esi
movl FPtr@GOT(%edx), %ecx
pushl %ebx
movl inc@GOT(%edx), %ebx
movl FEOF@GOT(%edx), %edi
movl (%ecx), %eax
movl (%ebx), %esi
leal (%eax,%esi,4), %esi
cmpl %esi, (%edi)
jnb .L2
movl F@GOT(%edx), %eax
movl (%eax), %eax
.L2:
movl 16(%esp), %edx
movl %edx, (%eax)
movl (%ebx), %edx
leal (%eax,%edx,4), %eax
movl %eax, (%ecx)
popl %ebx
popl %esi
popl %edi
ret
.size bar, .-bar
Why doesn't it improve performance? Why does it hurt performance instead?
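Editorial note: attachment 34814 is not reproduced in this thread, so the following is only a guess at C source that is consistent with both listings of bar above. The names F, FPtr, FEOF and inc are taken from the assembly; the int * types (suggested by the scale factor 4 in the leal instructions under -m32), the parameter name, and the exact shape of the condition are assumptions.

```c
/* Hypothetical reconstruction of bar(); not the attached test case.
   Globals are named after the symbols in the assembly listings. */
int *F;      /* start of the buffer (assumed meaning) */
int *FPtr;   /* current write position */
int *FEOF;   /* end-of-buffer limit */
int inc;     /* stride, in elements */

void bar(int value)
{
    int *p = FPtr;

    /* cmpl/cmovb in the short listing, cmpl/jnb in the long one:
       fall back to F when FPtr + inc would pass FEOF. */
    if (FEOF < p + inc)
        p = F;

    *p = value;        /* movl %ecx, (%edx) */
    FPtr = p + inc;    /* inc is loaded a second time in both listings */
}
```

In the shorter listing the four globals are addressed directly with @GOTOFF; in the longer one each address is first loaded from the GOT, which is where the extra loads and the three saved registers come from.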
* [Bug rtl-optimization/65135] [5 Regression] Performance regression in pic mode after r220674.
From: hjl.tools at gmail dot com @ 2015-02-20 13:07 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65135
H.J. Lu <hjl.tools at gmail dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|UNCONFIRMED |WAITING
Last reconfirmed| |2015-02-20
Ever confirmed|0 |1
--- Comment #3 from H.J. Lu <hjl.tools at gmail dot com> ---
(In reply to Yuri Rumyantsev from comment #1)
> Created attachment 34814 [details]
> test-case to reproduce
>
> Need to compile with -O2 -m32 -fPIE -pie options.
Please provide your assembly code.
* [Bug rtl-optimization/65135] [5 Regression] Performance regression in pic mode after r220674.
From: hjl.tools at gmail dot com @ 2015-02-20 15:39 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65135
H.J. Lu <hjl.tools at gmail dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|WAITING |NEW
CC| |vmakarov at redhat dot com
--- Comment #5 from H.J. Lu <hjl.tools at gmail dot com> ---
After r220674, we have
movl z1.1821@GOTOFF(%ebp), %eax
...
movl %eax, %esi
...
movl %esi, 4(%esp)
...
.L6:
movl 4(%esp), %ebx
testl %ebx, %ebx
jne .L9
movl 20(%esp), %eax
movl (%eax,%edx), %eax
cmpl $-1, %eax
je .L11
.L42:
movl 8(%esp), %edi
leal 0(,%eax,4), %edx
movl %ecx, %ebx
addl %edx, %edi
cmpl $101, %ecx
je .L40
movl 12(%esp), %esi
cmpl (%edi), %esi
leal 1(%ebx), %ecx
jne .L6
vs
movl z1.1821@GOTOFF(%ebx), %esi
...
.L8:
testl %esi, %esi
jne .L11
movl 16(%esp), %eax
movl (%eax,%edx), %eax
cmpl $-1, %eax
je .L13
.L43:
movl 4(%esp), %edi
leal 0(,%eax,4), %edx
movl %ecx, %ebx
addl %edx, %edi
cmpl $101, %ecx
je .L41
movl 8(%esp), %ebp
cmpl (%edi), %ebp
leal 1(%ebx), %ecx
jne .L8
RA puts the GOT pointer into %ebp and does not use %ebp for anything else in this loop.
* [Bug rtl-optimization/65135] [5 Regression] Performance regression in pic mode after r220674.
From: law at redhat dot com @ 2015-03-03 16:00 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65135
Jeffrey A. Law <law at redhat dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
Priority|P3 |P2
CC| |law at redhat dot com
* [Bug rtl-optimization/65135] [5 Regression] Performance regression in pic mode after r220674.
From: vmakarov at gcc dot gnu.org @ 2015-03-06 19:43 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65135
Vladimir Makarov <vmakarov at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |vmakarov at gcc dot gnu.org
--- Comment #6 from Vladimir Makarov <vmakarov at gcc dot gnu.org> ---
There is nothing that can be done here. RA is all about heuristics, not
about optimal solutions.
In this case we have
movl %esi, 4(%esp) # 251 *movsi_internal/2 [length = 4]
movl %eax, 20(%esp) # 276 *movsi_internal/2 [length = 4]
movl R@GOTOFF(%ebp), %eax # 51 *movsi_internal/1 [length = 6]
movl %eax, 24(%esp) # 277 *movsi_internal/2 [length = 4]
.p2align 4,,10
.p2align 3
.L6:
movl 4(%esp), %ebx # 367 *movsi_internal/1 [length = 4]
testl %ebx, %ebx # 368 *cmpsi_ccno_1/1 [length = 2]
jne .L9 # 75 *jcc_1 [length = 6]
movl 20(%esp), %eax # 280 *movsi_internal/1 [length = 4]
movl (%eax,%edx), %eax # 77 *movsi_internal/1 [length = 3]
cmpl $-1, %eax # 85 *cmpsi_1/1 [length = 3]
je .L11 # 86 *jcc_1 [length = 6]
.L42:
movl 8(%esp), %edi # 282 *movsi_internal/1 [length = 4]
leal 0(,%eax,4), %edx # 317 *leasi [length = 7]
movl %ecx, %ebx # 90 *movsi_internal/1 [length = 2]
addl %edx, %edi # 89 *addsi_1/1 [length = 2]
cmpl $101, %ecx # 92 *cmpsi_1/1 [length = 3]
je .L40 # 93 *jcc_1 [length = 6]
movl 12(%esp), %esi # 278 *movsi_internal/1 [length = 4]
cmpl (%edi), %esi # 56 *cmpsi_1/2 [length = 2]
leal 1(%ebx), %ecx # 318 *leasi [length = 3]
jne .L6 # 57 *jcc_1 [length = 2]
Insn #251 is created by IRA for live range splitting of p126 around
the loop. The live range in the loop uses p154. So after IRA we have
p126 and p154 assigned to SI. Then, in the live range of p154, LRA
generates a reload for insn 56:
Creating newreg=171 from oldreg=92, assigning class GENERAL_REGS to r171
56: flags:CCZ=cmp(r171:SI,[r158:SI])
Inserting insn reload before:
278: r171:SI=r92:SI
and tries to assign a hard reg to p171:
Assigning to 171 (cl=GENERAL_REGS, orig=92, freq=1314, tfirst=171, tfreq=1314)...
Trying 0: spill 157(freq=2604) Now best 0(cost=1290)
Trying 1: spill 155(freq=2606)
Trying 2: spill 153(freq=2156) Now best 2(cost=842)
Trying 3: spill 156(freq=2320)
Trying 4: spill 154(freq=1584) Now best 4(cost=270)
Trying 5: spill 158(freq=1381)
Spill r154(hr=4, freq=1584) for r171
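Editorial note: the "cost" figures in this dump are consistent with the frequency of the pseudo that would be spilled minus the frequency of r171 (2604 - 1314 = 1290, 2156 - 1314 = 842, 1584 - 1314 = 270). The sketch below is a toy model of that arithmetic, not LRA's actual code. The struct, the function names and the used_by_insn field are all hypothetical. Hard register 5 (r158, freq=1381) would give a cost of 67, yet the dump never marks it as best; the sketch skips it on the assumption that this is because insn 56 itself uses r158 as its address operand, which the thread does not state.

```c
#include <stddef.h>

/* One "Trying N: spill P(freq=F)" line from the dump. */
struct candidate {
    int hard_regno;      /* hard register being tried */
    int spilled_pseudo;  /* pseudo that would have to be spilled */
    int spilled_freq;    /* freq= value printed for that pseudo */
    int used_by_insn;    /* assumption: insn being reloaded needs this pseudo */
};

/* Values copied from the dump; only the last column is an assumption. */
const struct candidate dump_candidates[6] = {
    {0, 157, 2604, 0}, {1, 155, 2606, 0}, {2, 153, 2156, 0},
    {3, 156, 2320, 0}, {4, 154, 1584, 0}, {5, 158, 1381, 1},
};

/* Cost of evicting the candidate's pseudo in favour of the reload pseudo. */
int spill_cost(const struct candidate *c, int reload_freq)
{
    return c->spilled_freq - reload_freq;
}

/* Return the hard register with the lowest cost, or -1 if none qualifies. */
int pick_hard_reg(const struct candidate *cands, size_t n, int reload_freq)
{
    int best = -1, best_cost = 0;

    for (size_t i = 0; i < n; i++) {
        if (cands[i].used_by_insn)
            continue;                     /* assumed exclusion, see above */
        int cost = spill_cost(&cands[i], reload_freq);
        if (best < 0 || cost < best_cost) {
            best = cands[i].hard_regno;
            best_cost = cost;
        }
    }
    return best;
}
```

With the reload frequency 1314 from the "Assigning to 171" line, pick_hard_reg(dump_candidates, 6, 1314) selects hard register 4, matching the final "Spill r154(hr=4, ...)" line.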
So LRA chooses to spill p154 as the cheapest one. Skipping some tried
transformations (such as optional reloads and inheritance), we have at
the end of LRA:
74: flags:CCZ=cmp([sp:SI+0x4],0)
which is transformed by the peephole2 pass into:
367: bx:SI=[sp:SI+0x4]
368: flags:CCZ=cmp(bx:SI,0)
75: pc={(flags:CCZ!=0)?L80:pc}
We cannot reuse si for insn 368 because it is also used in insn #56 (this
is the reload I mentioned above). So we cannot simply change [sp:SI+0x4]
to SI, because its value can be corrupted after the first loop iteration.
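Editorial note: the clobbering argument can be made concrete with a small simulation. This is a hypothetical model, not compiler code: the struct fields stand for %esi and the stack slots 4(%esp) and 12(%esp) from the listing above, and only insns 367/368 (the test at .L6) and 278 (the reload that overwrites %esi) are modelled.

```c
#include <stdbool.h>

/* Hypothetical machine state: one register and two stack slots. */
struct machine {
    int esi;     /* %esi */
    int slot4;   /* 4(%esp): spilled copy written by insn 251 */
    int slot12;  /* 12(%esp): home of r92, reloaded by insn 278 */
};

/* One trip through the loop; returns the value the test at .L6 sees. */
static int iterate(struct machine *m, bool test_reads_register)
{
    int tested = test_reads_register ? m->esi : m->slot4; /* insns 367/368 */
    m->esi = m->slot12;       /* insn 278 overwrites %esi for insn 56 */
    return tested;
}

/* Value observed by the test on the second iteration. */
int observed_on_second_iteration(int z1, int r92, bool test_reads_register)
{
    struct machine m = { .esi = z1, .slot4 = z1, .slot12 = r92 };

    iterate(&m, test_reads_register);
    return iterate(&m, test_reads_register);
}
```

Reading the stack slot yields z1 on every iteration, while reading %esi yields r92 from the second iteration on, which is why the load from 4(%esp) stays inside the loop.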
A change in an earlier part of the compiler resulted in different code
before RA, and so we have different code after RA.
So this bug will not be fixed, at least not by changes in RA.
* [Bug rtl-optimization/65135] [5/6 Regression] Performance regression in pic mode after r220674.
From: jakub at gcc dot gnu.org @ 2015-04-22 11:59 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65135
Jakub Jelinek <jakub at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Target Milestone|5.0 |5.2
--- Comment #7 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
GCC 5.1 has been released.
* [Bug rtl-optimization/65135] [5/6 Regression] Performance regression in pic mode after r220674.
From: rguenth at gcc dot gnu.org @ 2015-07-16 9:16 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65135
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Target Milestone|5.2 |5.3
--- Comment #8 from Richard Biener <rguenth at gcc dot gnu.org> ---
GCC 5.2 is being released, adjusting target milestone to 5.3.