public inbox for gcc-bugs@sourceware.org
* [Bug rtl-optimization/21617] [4.4/4.5/4.6/4.7 Regression] CRC64 algorithm optimization problem on Intel 32-bit
From: rguenth at gcc dot gnu.org @ 2011-06-12 12:38 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21617
Richard Guenther <rguenth at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |4.4.7
* [Bug rtl-optimization/21617] [4.4/4.5/4.6/4.7 Regression] CRC64 algorithm optimization problem on Intel 32-bit
From: pinskia at gcc dot gnu.org @ 2011-12-08 3:35 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21617
--- Comment #3 from Andrew Pinski <pinskia at gcc dot gnu.org> 2011-12-08 03:34:57 UTC ---
-O1:
.L2:
        movl    %eax, %edx
        sarl    $31, %edx
        shrl    $24, %edx
        leal    (%eax,%edx), %ecx
        andl    $255, %ecx
        subl    %edx, %ecx
        movb    %cl, (%ebx,%eax)
        addl    $1, %eax
        cmpl    $8192, %eax
        jne     .L2
-O2:
.L4:
        movl    76(%esp), %ecx
        movl    %edi, %ebx
        shrl    $24, %ebx
        movzbl  (%ecx), %edx
        addl    $1, %ecx
        movl    %ecx, 76(%esp)
        movl    %esi, %ecx
        sall    $8, %ecx
        xorl    %ebx, %edx
        movl    %edi, %ebx
        shldl   $8, %esi, %ebx
        movl    crc_table+4(,%edx,8), %edi
        movl    crc_table(,%edx,8), %esi
        xorl    %ebx, %edi
        xorl    %ecx, %esi
        cmpl    %eax, 76(%esp)
        jne     .L4
* [Bug rtl-optimization/21617] [4.4/4.5/4.6/4.7 Regression] CRC64 algorithm optimization problem on Intel 32-bit
From: pinskia at gcc dot gnu.org @ 2011-12-08 3:37 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21617
--- Comment #4 from Andrew Pinski <pinskia at gcc dot gnu.org> 2011-12-08 03:37:28 UTC ---
-O1:
<bb 5>:
  # __crc0_73 = PHI <__crc0_35(5), __crc0_54(7)>
  # __data_75 = PHI <__data_32(5), data_7(7)>
  D.1900_26 = __crc0_73 >> 56;
  D.1901_27 = (int) D.1900_26;
  D.1902_28 = MEM[base: __data_75, offset: 0B];
  D.1903_29 = (int) D.1902_28;
  D.1904_30 = D.1903_29 ^ D.1901_27;
  __tab_index_31 = D.1904_30 & 255;
  __data_32 = __data_75 + 1;
  D.1905_33 = crc_table[__tab_index_31];
  D.1906_34 = __crc0_73 << 8;
  __crc0_35 = D.1905_33 ^ D.1906_34;
  if (__data_32 != D.1928_55)
    goto <bb 5>;
  else
    goto <bb 6>;
-O2:
<bb 5>:
  # __crc0_1 = PHI <__crc0_35(5), __crc0_54(7)>
  # __data_67 = PHI <__data_32(5), data_7(7)>
  D.1900_26 = __crc0_1 >> 56;
  D.1901_27 = (int) D.1900_26;
  D.1902_28 = MEM[base: __data_67, offset: 0B];
  D.1903_29 = (int) D.1902_28;
  D.1904_30 = D.1903_29 ^ D.1901_27;
  __tab_index_31 = D.1904_30 & 255;
  __data_32 = __data_67 + 1;
  D.1905_33 = crc_table[__tab_index_31];
  D.1906_34 = __crc0_1 << 8;
  __crc0_35 = D.1905_33 ^ D.1906_34;
  if (__data_32 != D.1955_86)
    goto <bb 5>;
  else
    goto <bb 6>;
In other words, nothing at the tree level causes the issue: apart from SSA
names, the loop body is the same at -O1 and -O2.
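For readers who do not read GIMPLE, the loop in the dumps corresponds roughly
to the C below. This is only a sketch reconstructed from the dump; the test
case source (crctest64.c) is not quoted in this thread. The function names
crc64_init_table and crc64_update and the ECMA-182 polynomial are assumptions
made for illustration, since the thread never shows how crc_table is filled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: ECMA-182 polynomial. The thread does not show the table. */
#define CRC64_POLY UINT64_C(0x42F0E1EBA9EA3693)

/* 256 entries of 8 bytes each, matching crc_table(,%edx,8) in the asm. */
static uint64_t crc_table[256];

/* Hypothetical helper: fill the table for an MSB-first CRC. */
static void crc64_init_table(void)
{
    for (int i = 0; i < 256; i++) {
        uint64_t c = (uint64_t)i << 56;
        for (int bit = 0; bit < 8; bit++)
            c = (c & (UINT64_C(1) << 63)) ? (c << 1) ^ CRC64_POLY : c << 1;
        crc_table[i] = c;
    }
}

/* Mirrors <bb 5>: one table lookup, one shift and two XORs per byte. */
static uint64_t crc64_update(uint64_t crc, const unsigned char *data,
                             size_t len)
{
    assert(data != NULL || len == 0);
    if (len == 0)
        return crc;

    const unsigned char *end = data + len;
    do {
        /* D.1904 = (int) *data ^ (int) (crc >> 56); index = D.1904 & 255 */
        int tab_index = ((int)*data ^ (int)(crc >> 56)) & 255;
        data++;
        /* __crc0 = crc_table[index] ^ (crc << 8) */
        crc = crc_table[tab_index] ^ (crc << 8);
    } while (data != end);
    return crc;
}
```

On a 32-bit target the 64-bit crc value lives in a pair of 32-bit registers,
which is why the -O2 assembly above uses shldl and two table loads per
iteration.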
* [Bug rtl-optimization/21617] [4.4/4.5/4.6/4.7 Regression] CRC64 algorithm optimization problem on Intel 32-bit
From: pinskia at gcc dot gnu.org @ 2011-12-08 3:53 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21617
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |ra
--- Comment #5 from Andrew Pinski <pinskia at gcc dot gnu.org> 2011-12-08 03:53:22 UTC ---
This is definitely a register allocation issue because the RTL is basically the
same when entering IRA.
* [Bug rtl-optimization/21617] [4.4/4.5/4.6/4.7 Regression] CRC64 algorithm optimization problem on Intel 32-bit
From: vmakarov at redhat dot com @ 2011-12-09 19:16 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21617
Vladimir Makarov <vmakarov at redhat dot com> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |vmakarov at redhat dot com
--- Comment #6 from Vladimir Makarov <vmakarov at redhat dot com> 2011-12-09 19:09:52 UTC ---
There is a small difference in the code which results in such degradation.
-O1 generates the following insn in the main loop:
(insn 43 42 44 5 /home/cygnus/vmakarov/build1/trunk/crctest64.c:241 (parallel [
            (set (reg/v:SI 77 [ __tab_index ])
                (xor:SI (reg:SI 108)
                    (reg:SI 120)))
            (clobber (reg:CC 17 flags))
        ]) 395 {*xorsi_1} (expr_list:REG_DEAD (reg:SI 108)
        (expr_list:REG_DEAD (reg:SI 120)
            (expr_list:REG_UNUSED (reg:CC 17 flags)
                (nil)))))
-O2 generates an analogous insn:
(insn 39 38 40 5 /home/cygnus/vmakarov/build1/trunk/crctest64.c:241 (parallel [
            (set (reg/v:SI 83 [ __tab_index ])
                (xor:SI (reg/v:SI 83 [ __tab_index ])
                    (reg:SI 143)))
            (clobber (reg:CC 17 flags))
        ]) 395 {*xorsi_1} (expr_list:REG_DEAD (reg:SI 143)
        (expr_list:REG_UNUSED (reg:CC 17 flags)
            (nil))))
The difference is caused by the regmove optimization.
The RTL insn in the second variant looks even better, but it makes
pseudo 83 the most frequently used pseudo. It is therefore pushed last
onto the coloring stack, among a bunch of trivially colorable pseudos,
and assigned first. The set of trivially colorable pseudos contains two
double-word pseudos, each of which needs two adjacent hard registers.
Assigning pseudo 83 first (the case is further complicated because some
pseudos cross calls) leaves only one pair of adjacent hard registers.
Two hard registers are still free for the second double-word pseudo,
but they are not adjacent. As a result, one double-word pseudo is
spilled and code performance degrades.
For -O1, the analogue of pseudo 83 (p77) is assigned last, after the two
double-word pseudos, and no spilling occurs.
To solve the problem we should increase the probability of keeping free
hard registers adjacent. This can be done by modifying the function
bucket_allocno_compare_func so that multi-word pseudos are pushed last
onto the coloring stack and, as a consequence, assigned first. I did
this and the problem was solved. Unfortunately, it results in a 2%
performance degradation on SPEC2000 perlbmk, although there is a small
code size improvement on SPEC2000 with this heuristic.
On a general note, register allocation is all about heuristics, so it
is always possible to find a test where one heuristic works worse than
others. What matters most is that the RA works well overall (on a big,
credible set of tests). From this point of view IRA is much better
than the previous register allocator.
But because CRC code is important, I'll continue working on tuning
that does not degrade SPEC2000 and that does solve the problem.
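The adjacency effect described above can be reproduced with a toy model. The
sketch below is not IRA code. It assumes five free hard registers in a row, a
greedy first-fit assignment, and a single-word pseudo that prefers register 1;
struct pseudo, range_free and assign_in_order are hypothetical names invented
for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_HARD_REGS 5   /* assumption: five free registers, numbered 0..4 */

struct pseudo {
    int nregs;      /* adjacent hard registers needed (1 or 2) */
    int preferred;  /* preferred first hard register, or -1 for none */
    int assigned;   /* first assigned hard register, or -1 if spilled */
};

/* True if registers start .. start+nregs-1 exist and are all free. */
static bool range_free(const bool used[], int start, int nregs)
{
    if (start < 0 || start + nregs > NUM_HARD_REGS)
        return false;
    for (int i = 0; i < nregs; i++)
        if (used[start + i])
            return false;
    return true;
}

/* Assign pseudos greedily in the given order; return the spill count. */
static int assign_in_order(struct pseudo *order[], size_t n)
{
    bool used[NUM_HARD_REGS] = { false };
    int spills = 0;

    for (size_t i = 0; i < n; i++) {
        struct pseudo *p = order[i];
        int start = -1;

        assert(p->nregs >= 1);
        /* Try the preferred register first, then fall back to first fit. */
        if (p->preferred >= 0 && range_free(used, p->preferred, p->nregs))
            start = p->preferred;
        for (int r = 0; start < 0 && r < NUM_HARD_REGS; r++)
            if (range_free(used, r, p->nregs))
                start = r;

        p->assigned = start;
        if (start < 0) {
            spills++;           /* no adjacent run of nregs registers left */
            continue;
        }
        for (int k = 0; k < p->nregs; k++)
            used[start + k] = true;
    }
    return spills;
}
```

With one single-word pseudo preferring register 1 and two double-word
pseudos, assigning the single-word pseudo first takes register 1, the first
pair takes registers 2 and 3, and the second pair is spilled even though
registers 0 and 4 are still free. Assigning the two pairs first leaves
register 4 for the single-word pseudo and nothing is spilled.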
* [Bug rtl-optimization/21617] [4.4/4.5/4.6/4.7 Regression] CRC64 algorithm optimization problem on Intel 32-bit
From: vmakarov at gcc dot gnu.org @ 2011-12-12 21:00 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21617
--- Comment #7 from Vladimir Makarov <vmakarov at gcc dot gnu.org> 2011-12-12 20:51:19 UTC ---
Author: vmakarov
Date: Mon Dec 12 20:51:16 2011
New Revision: 182263
URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=182263
Log:
2011-12-12 Vladimir Makarov <vmakarov@redhat.com>
PR rtl-optimization/21617
* ira-color.c (bucket_allocno_compare_func): Don't compare
allocno classes. Compare number of hard registers needed.
Modified:
trunk/gcc/ChangeLog
trunk/gcc/ira-color.c
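The thread quotes only the ChangeLog entry, not the patch itself. As a rough
illustration of what "compare number of hard registers needed" could mean for
the push order, here is a hypothetical qsort comparator. struct toy_allocno,
its fields, the tie-breaking keys and sort_for_push are assumptions for this
sketch; the real bucket_allocno_compare_func in ira-color.c works on IRA
allocnos and may use different keys in a different order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for an IRA allocno. */
struct toy_allocno {
    int id;     /* pseudo register number */
    int nregs;  /* hard registers needed */
    int freq;   /* usage frequency */
};

/* Order for pushing onto the coloring stack: whatever sorts last is pushed
   last, hence popped and assigned first.  Allocnos needing more hard
   registers sort last; frequency and id only break ties. */
static int push_order_compare(const void *pa, const void *pb)
{
    const struct toy_allocno *a = pa;
    const struct toy_allocno *b = pb;

    if (a->nregs != b->nregs)
        return (a->nregs > b->nregs) - (a->nregs < b->nregs);
    if (a->freq != b->freq)
        return (a->freq > b->freq) - (a->freq < b->freq);
    return (a->id > b->id) - (a->id < b->id);
}

static void sort_for_push(struct toy_allocno *v, size_t n)
{
    assert(v != NULL || n == 0);
    if (n > 1)
        qsort(v, n, sizeof *v, push_order_compare);
}
```

With this ordering a frequently used single-word pseudo such as pseudo 83 is
pushed before the double-word pseudos and therefore assigned after them,
which is the -O1 behaviour described in Comment #6.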
* [Bug rtl-optimization/21617] [4.4/4.5/4.6 Regression] CRC64 algorithm optimization problem on Intel 32-bit
From: jakub at gcc dot gnu.org @ 2011-12-12 21:26 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21617
Jakub Jelinek <jakub at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org
            Summary|[4.4/4.5/4.6/4.7            |[4.4/4.5/4.6 Regression]
                   |Regression] CRC64 algorithm |CRC64 algorithm
                   |optimization problem on     |optimization problem on
                   |Intel 32-bit                |Intel 32-bit
--- Comment #8 from Jakub Jelinek <jakub at gcc dot gnu.org> 2011-12-12 21:15:23 UTC ---
Fixed on the trunk. I don't think it is desirable to backport this.
* [Bug rtl-optimization/21617] [4.5/4.6 Regression] CRC64 algorithm optimization problem on Intel 32-bit
From: jakub at gcc dot gnu.org @ 2012-03-13 15:31 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21617
Jakub Jelinek <jakub at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|4.4.7                       |4.5.4
--- Comment #9 from Jakub Jelinek <jakub at gcc dot gnu.org> 2012-03-13 12:48:16 UTC ---
The 4.4 branch is being closed, moving to the 4.5.4 target.
* [Bug rtl-optimization/21617] [4.5/4.6 Regression] CRC64 algorithm optimization problem on Intel 32-bit
From: rguenth at gcc dot gnu.org @ 2012-07-02 12:11 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21617
Richard Guenther <rguenth at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|4.5.4                       |4.6.4
--- Comment #10 from Richard Guenther <rguenth at gcc dot gnu.org> 2012-07-02 12:10:39 UTC ---
The 4.5 branch is being closed, adjusting target milestone.
* [Bug rtl-optimization/21617] [4.6 Regression] CRC64 algorithm optimization problem on Intel 32-bit
From: jakub at gcc dot gnu.org @ 2013-04-12 16:18 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21617
Jakub Jelinek <jakub at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED
   Target Milestone|4.6.4                       |4.7.0
--- Comment #11 from Jakub Jelinek <jakub at gcc dot gnu.org> 2013-04-12 16:17:59 UTC ---
The 4.6 branch has been closed, fixed in GCC 4.7.0.