public inbox for gcc-bugs@sourceware.org
* [Bug rtl-optimization/59036] New: [4.9 regression] Performance degradation after r204212 on 32-bit x86 targets.
@ 2013-11-07 10:16 ysrumyan at gmail dot com
2013-11-07 10:18 ` [Bug rtl-optimization/59036] " ysrumyan at gmail dot com
` (4 more replies)
0 siblings, 5 replies; 6+ messages in thread
From: ysrumyan at gmail dot com @ 2013-11-07 10:16 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59036
Bug ID: 59036
Summary: [4.9 regression] Performance degradation after r204212
on 32-bit x86 targets.
Product: gcc
Version: 4.9.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: rtl-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: ysrumyan at gmail dot com
After the patch that improves register preferencing in IRA and *removes the regmove*
pass (r204212), we noticed performance degradation on several benchmarks from the eembc2.0
suite in 32-bit mode on all x86 targets (such as atom, slm, hsw, etc.).
This can be reproduced with the attached test case: after the patch, 3 more instructions
are generated for the innermost loop (compiled with -O2 -m32 -march=core-avx2
options):
Before the patch:
.L4:
movl 12(%esp), %edx
addl $3, %ecx
movl 4(%esp), %ebx
movl (%esp), %ebp
movl 8(%esp), %esi
movzbl (%edx,%eax), %edi
movl 16(%esp), %edx
movzbl (%ebx,%eax), %ebx
movzbl (%esi,%eax), %esi
addl $1, %eax
addl (%edx,%edi,4), %ebp
movzbl 0(%ebp,%ebx), %edx
movl 28(%esp), %ebp
movb %dl, -3(%ecx)
movl 24(%esp), %edx
movl (%edx,%edi,4), %edx
movl (%esp), %edi
addl 0(%ebp,%esi,4), %edx
leal (%edi,%ebx), %ebp
sarl $16, %edx
movzbl 0(%ebp,%edx), %edx
movl 20(%esp), %ebp
movb %dl, -2(%ecx)
movl 0(%ebp,%esi,4), %edx
addl %edi, %edx
movzbl (%edx,%ebx), %edx
movb %dl, -1(%ecx)
cmpl 80(%esp), %eax
jne .L4
After the patch:
.L4:
movl 8(%esp), %ebx
addl $3, %edx
movl 12(%esp), %esi
movl 4(%esp), %ecx
movzbl (%ebx,%eax), %ebx
movzbl (%esi,%eax), %esi
movzbl (%ecx,%eax), %ecx
addl $1, %eax
movb %bl, (%esp)
movl 16(%esp), %ebx
movl (%ebx,%esi,4), %ebp
addl %edi, %ebp
movzbl 0(%ebp,%ecx), %ebx
movzbl (%esp), %ebp
movb %bl, -3(%edx)
movl 24(%esp), %ebx
movl %ebp, (%esp)
movl (%ebx,%esi,4), %esi
movl 28(%esp), %ebx
addl (%ebx,%ebp,4), %esi
leal (%edi,%ecx), %ebp
sarl $16, %esi
movzbl 0(%ebp,%esi), %ebx
movl 20(%esp), %esi
movl (%esp), %ebp
movb %bl, -2(%edx)
movl %edi, %ebx
addl (%esi,%ebp,4), %ebx
movzbl (%ebx,%ecx), %ecx
movb %cl, -1(%edx)
cmpl 80(%esp), %eax
jne .L4
* [Bug rtl-optimization/59036] [4.9 regression] Performance degradation after r204212 on 32-bit x86 targets.
From: ysrumyan at gmail dot com @ 2013-11-07 10:18 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59036
--- Comment #1 from Yuri Rumyantsev <ysrumyan at gmail dot com> ---
Created attachment 31178
--> http://gcc.gnu.org/bugzilla/attachment.cgi?id=31178&action=edit
test-case to reproduce
The test needs to be compiled with the -m32 option for any x86 target.
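The attachment itself is not reproduced in this thread. Judging only from the loops quoted in comment #0 (three byte loads, table lookups, a `sarl $16`, and three byte stores per iteration), a loop of roughly the following shape would lead to similar code. This is a hypothetical sketch, not attachment 31178; the function name, parameter names, and table layout are all assumptions made for illustration.

```c
/* Hypothetical illustration only; NOT the attached test case.
 * All names here are made up. The loop keeps three input pointers,
 * four lookup tables, a clamp table, an output pointer and a counter
 * live at once, which is more than 32-bit x86 has registers for,
 * so the allocator's choices show up directly as extra loads/stores. */
void convert_row(const unsigned char *in0, const unsigned char *in1,
                 const unsigned char *in2, unsigned char *out, int n,
                 const unsigned char *limit,
                 const int *tab_r, const int *tab_b,
                 const int *tab_g1, const int *tab_g2)
{
    for (int i = 0; i < n; i++) {
        int y  = in0[i];
        int cb = in1[i];
        int cr = in2[i];

        /* One table lookup added to y, then clamped through limit[]. */
        out[0] = limit[y + tab_r[cr]];
        /* Two fixed-point lookups summed and scaled down by 16 bits.
         * Assumes the sum is non-negative, since >> on a negative int
         * is implementation-defined in C. */
        out[1] = limit[y + ((tab_g1[cb] + tab_g2[cr]) >> 16)];
        out[2] = limit[y + tab_b[cb]];
        out += 3;   /* corresponds to the addl $3 in the loops above */
    }
}
```

Compiling such a file with the options given in comment #0 (-O2 -m32 -march=core-avx2) and comparing the inner loop before and after r204212 is the kind of check the report describes.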
* [Bug rtl-optimization/59036] [4.9 regression] Performance degradation after r204212 on 32-bit x86 targets.
From: rguenth at gcc dot gnu.org @ 2013-11-07 10:59 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59036
Richard Biener <rguenth at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Target|                            |i?86-*-*
                 CC|                            |vmakarov at gcc dot gnu.org
   Target Milestone|---                         |4.9.0
* [Bug rtl-optimization/59036] [4.9 regression] Performance degradation after r204212 on 32-bit x86 targets.
From: vmakarov at gcc dot gnu.org @ 2013-11-07 15:34 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59036
--- Comment #2 from Vladimir Makarov <vmakarov at gcc dot gnu.org> ---
(In reply to Yuri Rumyantsev from comment #0)
> After patch to improve register preferencing in IRA and to *remove regmove*
> pass we noticed performance degradation on several benchmarks from eembc2.0
> suite in 32-bit mode for all x86 targets (such as atom, slm, hsw, etc.).
> This can be reproduced with attached test-case - after fix 3 more
> instructions are generated for innermost loop (compiled with -O2 -m32
> -march=core-avx2 options):
>
I am just curious: what is the overall score change? Are there only performance
degradations? Was anything improved?
In general, would you prefer to revert this patch? I am afraid that will be the
only solution for the PR.
I am asking because heuristic-based optimizations very frequently make some code
better and some code worse. That is their nature.
When I worked on this optimization I had to change about 15 tests from the GCC
testsuite checking AVX, and found that in every test unnecessary register
shuffling moves were deleted after applying the patch.
* [Bug rtl-optimization/59036] [4.9 regression] Performance degradation after r204212 on 32-bit x86 targets.
From: vmakarov at gcc dot gnu.org @ 2013-11-13 18:01 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59036
--- Comment #3 from Vladimir Makarov <vmakarov at gcc dot gnu.org> ---
Author: vmakarov
Date: Wed Nov 13 18:00:43 2013
New Revision: 204752
URL: http://gcc.gnu.org/viewcvs?rev=204752&root=gcc&view=rev
Log:
2013-11-13 Vladimir Makarov <vmakarov@redhat.com>
PR rtl-optimization/59036
* ira-color.c (struct allocno_color_data): Add new members
first_thread_allocno, next_thread_allocno, thread_freq.
(sorted_copies): New static var.
(allocnos_conflict_by_live_ranges_p, copy_freq_compare_func): Move
up.
(allocno_thread_conflict_p, merge_threads)
(form_threads_from_copies, form_threads_from_bucket)
(form_threads_from_colorable_allocno, init_allocno_threads): New
functions.
(bucket_allocno_compare_func): Add comparison by thread frequency
and threads.
(add_allocno_to_ordered_bucket): Rename to
add_allocno_to_ordered_colorable_bucket. Remove parameter.
(push_only_colorable): Call form_threads_from_bucket.
(color_pass): Call init_allocno_threads. Use
consideration_allocno_bitmap instead of coloring_allocno_bitmap
for nullifying allocno color data.
(ira_initiate_assign, ira_finish_assign): Allocate/free
sorted_copies.
(coalesce_allocnos): Use static sorted copies.
Modified:
trunk/gcc/ChangeLog
trunk/gcc/ira-color.c
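The log above names the new pieces (thread members in the allocno data, sorted copies, a thread conflict check, thread merging) but does not show the code. As a rough illustration of what "forming threads from copies" could look like, here is a simplified sketch: copies are visited from most to least frequent, and the two allocnos of a copy are put in the same thread unless any members of their threads conflict. The data structures, names, and the way thread frequency is accumulated are assumptions for this sketch; the actual code in ira-color.c may differ.

```c
#include <stdbool.h>
#include <stdlib.h>

#define MAX_NODES 16   /* sketch-only limit on the number of allocnos */

/* Hypothetical stand-ins, not GCC's data structures. */
struct copy { int first, second, freq; };

struct threads {
    int n;                                /* number of allocnos */
    int leader[MAX_NODES];                /* representative of each allocno's thread */
    int thread_freq[MAX_NODES];           /* summed frequency, valid at a leader */
    bool conflict[MAX_NODES][MAX_NODES];  /* pairwise conflicts between allocnos */
};

/* Start with every allocno in its own thread and no conflicts recorded. */
void init_threads(struct threads *t, int n, const int *freq)
{
    t->n = n;
    for (int i = 0; i < n; i++) {
        t->leader[i] = i;
        t->thread_freq[i] = freq[i];
        for (int j = 0; j < n; j++)
            t->conflict[i][j] = false;
    }
}

/* qsort comparator: higher copy frequency first. */
static int by_freq_desc(const void *a, const void *b)
{
    const struct copy *x = a, *y = b;
    return (y->freq > x->freq) - (y->freq < x->freq);
}

/* True if any allocno of thread LA conflicts with any allocno of thread LB. */
static bool threads_conflict(const struct threads *t, int la, int lb)
{
    for (int i = 0; i < t->n; i++)
        if (t->leader[i] == la)
            for (int j = 0; j < t->n; j++)
                if (t->leader[j] == lb && t->conflict[i][j])
                    return true;
    return false;
}

/* Merge threads along copies, most frequent copy first. */
void form_threads(struct threads *t, struct copy *copies, int ncopies)
{
    qsort(copies, (size_t)ncopies, sizeof *copies, by_freq_desc);
    for (int c = 0; c < ncopies; c++) {
        int la = t->leader[copies[c].first];
        int lb = t->leader[copies[c].second];
        if (la == lb || threads_conflict(t, la, lb))
            continue;
        for (int i = 0; i < t->n; i++)   /* fold thread lb into thread la */
            if (t->leader[i] == lb)
                t->leader[i] = la;
        t->thread_freq[la] += t->thread_freq[lb];
    }
}
```

The intent of grouping allocnos this way is that members of one thread can then be ordered together during coloring, which makes it more likely that both sides of a frequent copy receive the same hard register and the move disappears.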
* [Bug rtl-optimization/59036] [4.9 regression] Performance degradation after r204212 on 32-bit x86 targets.
From: rguenth at gcc dot gnu.org @ 2013-11-19 15:04 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59036
Richard Biener <rguenth at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |FIXED
--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
I suppose fixed.