public inbox for gcc-bugs@sourceware.org
* [Bug rtl-optimization/21617] [4.4/4.5/4.6/4.7 Regression] CRC64 algorithm optimization problem on Intel 32-bit
From: rguenth at gcc dot gnu.org @ 2011-06-12 12:38 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21617

Richard Guenther <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |4.4.7



* [Bug rtl-optimization/21617] [4.4/4.5/4.6/4.7 Regression] CRC64 algorithm optimization problem on Intel 32-bit
From: pinskia at gcc dot gnu.org @ 2011-12-08  3:35 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21617

--- Comment #3 from Andrew Pinski <pinskia at gcc dot gnu.org> 2011-12-08 03:34:57 UTC ---
-O1:
.L2:
    movl    %eax, %edx
    sarl    $31, %edx
    shrl    $24, %edx
    leal    (%eax,%edx), %ecx
    andl    $255, %ecx
    subl    %edx, %ecx
    movb    %cl, (%ebx,%eax)
    addl    $1, %eax
    cmpl    $8192, %eax
    jne    .L2

-O2:
.L4:
    movl    76(%esp), %ecx
    movl    %edi, %ebx
    shrl    $24, %ebx
    movzbl    (%ecx), %edx
    addl    $1, %ecx
    movl    %ecx, 76(%esp)
    movl    %esi, %ecx
    sall    $8, %ecx
    xorl    %ebx, %edx
    movl    %edi, %ebx
    shldl    $8, %esi, %ebx
    movl    crc_table+4(,%edx,8), %edi
    movl    crc_table(,%edx,8), %esi
    xorl    %ebx, %edi
    xorl    %ecx, %esi
    cmpl    %eax, 76(%esp)
    jne    .L4
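
For readers following the -O2 listing: it appears to keep the 64-bit CRC in the %edi:%esi register pair (high:low) and the data pointer in the stack slot 76(%esp). The C sketch below shows how `crc << 8` and `crc >> 56` decompose into 32-bit operations under that reading. The type and function names are hypothetical and only illustrate the arithmetic; they are not taken from the test case.

```c
#include <assert.h>
#include <stdint.h>

/* A 64-bit value held as two 32-bit halves, like a register pair on ia32.
   All names here are hypothetical, for illustration only. */
typedef struct {
    uint32_t lo;
    uint32_t hi;
} u64_halves;

static u64_halves split64(uint64_t x)
{
    u64_halves v = { (uint32_t)x, (uint32_t)(x >> 32) };
    return v;
}

static uint64_t join64(u64_halves v)
{
    return ((uint64_t)v.hi << 32) | v.lo;
}

/* crc << 8 computed on the halves. */
static u64_halves shl8_halves(u64_halves v)
{
    u64_halves r;
    r.hi = (v.hi << 8) | (v.lo >> 24);  /* what shldl $8, lo, hi computes */
    r.lo = v.lo << 8;                   /* what sall $8, lo computes */
    return r;
}

/* crc >> 56 only needs the high half. */
static uint32_t top_byte(u64_halves v)
{
    return v.hi >> 24;                  /* what shrl $24, hi computes */
}

/* Cross-check the half-word version against plain 64-bit arithmetic. */
static void check_halves(uint64_t x)
{
    assert(join64(shl8_halves(split64(x))) == (x << 8));
    assert(top_byte(split64(x)) == (uint32_t)(x >> 56));
}
```

Calling `check_halves` on any value compares the two-register form with the native 64-bit result.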



* [Bug rtl-optimization/21617] [4.4/4.5/4.6/4.7 Regression] CRC64 algorithm optimization problem on Intel 32-bit
From: pinskia at gcc dot gnu.org @ 2011-12-08  3:37 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21617

--- Comment #4 from Andrew Pinski <pinskia at gcc dot gnu.org> 2011-12-08 03:37:28 UTC ---
-O1:

<bb 5>:
  # __crc0_73 = PHI <__crc0_35(5), __crc0_54(7)>
  # __data_75 = PHI <__data_32(5), data_7(7)>
  D.1900_26 = __crc0_73 >> 56;
  D.1901_27 = (int) D.1900_26;
  D.1902_28 = MEM[base: __data_75, offset: 0B];
  D.1903_29 = (int) D.1902_28;
  D.1904_30 = D.1903_29 ^ D.1901_27;
  __tab_index_31 = D.1904_30 & 255;
  __data_32 = __data_75 + 1;
  D.1905_33 = crc_table[__tab_index_31];
  D.1906_34 = __crc0_73 << 8;
  __crc0_35 = D.1905_33 ^ D.1906_34;
  if (__data_32 != D.1928_55)
    goto <bb 5>;
  else
    goto <bb 6>;

-O2:
<bb 5>:
  # __crc0_1 = PHI <__crc0_35(5), __crc0_54(7)>
  # __data_67 = PHI <__data_32(5), data_7(7)>
  D.1900_26 = __crc0_1 >> 56;
  D.1901_27 = (int) D.1900_26;
  D.1902_28 = MEM[base: __data_67, offset: 0B];
  D.1903_29 = (int) D.1902_28;
  D.1904_30 = D.1903_29 ^ D.1901_27;
  __tab_index_31 = D.1904_30 & 255;
  __data_32 = __data_67 + 1;
  D.1905_33 = crc_table[__tab_index_31];
  D.1906_34 = __crc0_1 << 8;
  __crc0_35 = D.1905_33 ^ D.1906_34;
  if (__data_32 != D.1955_86)
    goto <bb 5>;
  else
    goto <bb 6>;

In other words, nothing at the tree level causes the issue: the -O1 and
-O2 loop bodies are identical apart from SSA names.
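
As a reading aid, the loop in <bb 5> corresponds roughly to the C below. This is a sketch reconstructed from the dump, not the original crctest64.c: the function names are hypothetical, and the table contents and polynomial (ECMA-182 here) are assumptions, since the thread does not show them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table; the real crc_table contents are not shown in the thread. */
static uint64_t crc_table[256];

/* Fill the table for an MSB-first CRC-64.  The polynomial is an assumption. */
static void crc64_init_table(void)
{
    const uint64_t poly = 0x42F0E1EBA9EA3693ULL;
    for (unsigned i = 0; i < 256; i++) {
        uint64_t crc = (uint64_t)i << 56;
        for (int bit = 0; bit < 8; bit++)
            crc = (crc & 0x8000000000000000ULL) ? (crc << 1) ^ poly : crc << 1;
        crc_table[i] = crc;
    }
}

/* One table lookup per input byte, mirroring the statements in <bb 5>. */
static uint64_t crc64_update(uint64_t crc, const unsigned char *data, size_t len)
{
    assert(data != NULL);
    const unsigned char *end = data + len;
    while (data != end) {
        /* D.1904 = (int) *data ^ (int) (crc >> 56); tab_index = D.1904 & 255 */
        int tab_index = ((int)*data ^ (int)(crc >> 56)) & 255;
        data++;
        /* crc = crc_table[tab_index] ^ (crc << 8) */
        crc = crc_table[tab_index] ^ (crc << 8);
    }
    return crc;
}
```

Each iteration needs the 64-bit `crc`, the 64-bit table entry, the data pointer, the end pointer and the table index to be live, which is a lot for the handful of integer registers available on 32-bit x86.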



* [Bug rtl-optimization/21617] [4.4/4.5/4.6/4.7 Regression] CRC64 algorithm optimization problem on Intel 32-bit
From: pinskia at gcc dot gnu.org @ 2011-12-08  3:53 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21617

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |ra

--- Comment #5 from Andrew Pinski <pinskia at gcc dot gnu.org> 2011-12-08 03:53:22 UTC ---
This is definitely a register allocation issue because the RTL is basically the
same when entering IRA.



* [Bug rtl-optimization/21617] [4.4/4.5/4.6/4.7 Regression] CRC64 algorithm optimization problem on Intel 32-bit
From: vmakarov at redhat dot com @ 2011-12-09 19:16 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21617

Vladimir Makarov <vmakarov at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |vmakarov at redhat dot com

--- Comment #6 from Vladimir Makarov <vmakarov at redhat dot com> 2011-12-09 19:09:52 UTC ---
There is a small difference in the code that results in this degradation.

-O1 generates this insn in the main loop:

(insn 43 42 44 5 /home/cygnus/vmakarov/build1/trunk/crctest64.c:241 (parallel [
            (set (reg/v:SI 77 [ __tab_index ])
                (xor:SI (reg:SI 108)
                    (reg:SI 120)))
            (clobber (reg:CC 17 flags))
        ]) 395 {*xorsi_1} (expr_list:REG_DEAD (reg:SI 108)
        (expr_list:REG_DEAD (reg:SI 120)
            (expr_list:REG_UNUSED (reg:CC 17 flags)
                (nil)))))

-O2 generates an analogous insn:

(insn 39 38 40 5 /home/cygnus/vmakarov/build1/trunk/crctest64.c:241 (parallel [
            (set (reg/v:SI 83 [ __tab_index ])
                (xor:SI (reg/v:SI 83 [ __tab_index ])
                    (reg:SI 143)))
            (clobber (reg:CC 17 flags))
        ]) 395 {*xorsi_1} (expr_list:REG_DEAD (reg:SI 143)
        (expr_list:REG_UNUSED (reg:CC 17 flags)
            (nil))))

The difference is caused by the regmove optimization.

The RTL insn in the second variant looks even better, but it makes
pseudo 83 the most frequently used pseudo, so it is assigned first: it
is pushed last onto the coloring stack, among a bunch of trivially
colorable pseudos.  The set of trivially colorable pseudos contains two
double-word pseudos, each of which needs two adjacent hard registers.
Assigning pseudo 83 first (the case is further complicated because some
pseudos cross calls) leaves only one pair of adjacent hard registers.
Two hard registers are still free for the second double-word pseudo,
but they are not adjacent.  As a result, one double-word pseudo is
spilled and code performance degrades.

For -O1, the analog of pseudo 83 (p77) is assigned last, after the two
double-word pseudos have been assigned, and no spilling occurs.
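
The fragmentation effect can be reproduced with a toy model.  The sketch below is not IRA code: the six-register file, the occupied register 4 and the single-word pseudo's preference for register 1 are all assumptions chosen to recreate the situation described above, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define NUM_HARD_REGS 6

/* Take `nregs` adjacent free registers, scanning upward from `first`.
   Returns the first register of the group, or -1 if none fits (a spill). */
static int take_regs(bool free_regs[NUM_HARD_REGS], int nregs, int first)
{
    assert(nregs >= 1 && first >= 0);
    for (int r = first; r + nregs <= NUM_HARD_REGS; r++) {
        bool ok = true;
        for (int i = 0; i < nregs; i++)
            ok = ok && free_regs[r + i];
        if (ok) {
            for (int i = 0; i < nregs; i++)
                free_regs[r + i] = false;
            return r;
        }
    }
    return -1;
}

/* Count spills for one single-word and two double-word pseudos,
   depending on whether the single-word pseudo is assigned first. */
static int count_spills(bool single_first)
{
    /* Assumed starting state: register 4 is already occupied. */
    bool free_regs[NUM_HARD_REGS] = { true, true, true, true, false, true };
    int spills = 0;

    if (single_first)
        spills += take_regs(free_regs, 1, 1) < 0;  /* takes register 1 */
    spills += take_regs(free_regs, 2, 0) < 0;      /* first double-word pseudo */
    spills += take_regs(free_regs, 2, 0) < 0;      /* second double-word pseudo */
    if (!single_first)
        spills += take_regs(free_regs, 1, 1) < 0;  /* takes what is left */
    return spills;
}
```

With the single-word pseudo assigned first, registers 0 and 5 remain free but are not adjacent, so the second pair is spilled; with the pairs assigned first, all three pseudos get registers.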

To solve the problem we should increase the probability of keeping free
hard registers adjacent.  This can be done by modifying the function
bucket_allocno_compare_func so that multi-word pseudos are pushed last
onto the coloring stack and, as a consequence, assigned first.  I did
this and the problem was solved.  Unfortunately, it results in a 2%
performance degradation on SPEC2000 perlbmk, although there is a small
code size improvement on SPEC2000 with this heuristic.
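
The ordering idea can be sketched as a qsort comparator.  This is not the actual bucket_allocno_compare_func from ira-color.c: the struct, its fields, the tie-breaking rules and the function names are hypothetical and only show "pseudos needing more hard registers are pushed later, hence popped and assigned earlier".

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for an allocno; not the GCC data structure. */
typedef struct {
    int id;     /* pseudo number */
    int nregs;  /* number of hard registers needed */
    int freq;   /* usage frequency */
} toy_pseudo;

/* Order pseudos for pushing onto the coloring stack: fewer registers
   first, then lower frequency first, then by id for a stable result.
   The stack is popped in reverse, so the last element is assigned first. */
static int push_order_compare(const void *a, const void *b)
{
    const toy_pseudo *p = a;
    const toy_pseudo *q = b;

    if (p->nregs != q->nregs)
        return (p->nregs > q->nregs) - (p->nregs < q->nregs);
    if (p->freq != q->freq)
        return (p->freq > q->freq) - (p->freq < q->freq);
    return (p->id > q->id) - (p->id < q->id);
}

static void sort_for_push(toy_pseudo *pseudos, size_t count)
{
    assert(pseudos != NULL);
    qsort(pseudos, count, sizeof *pseudos, push_order_compare);
}
```

With this order a frequently used single-word pseudo such as pseudo 83 is pushed before the double-word pseudos, so the pairs are popped and assigned while adjacent hard registers are still available.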

On a general note, register allocation is all about heuristics.  So it
is always possible to find a test where one heuristic works worse than
others.  What matters most is that the RA works well overall (on a big,
credible set of tests).  From this point of view, IRA is much better
than the previous register allocator.

But because CRC code is important, I'll continue working on a tuning
that solves the problem without degrading SPEC2000.



* [Bug rtl-optimization/21617] [4.4/4.5/4.6/4.7 Regression] CRC64 algorithm optimization problem on Intel 32-bit
From: vmakarov at gcc dot gnu.org @ 2011-12-12 21:00 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21617

--- Comment #7 from Vladimir Makarov <vmakarov at gcc dot gnu.org> 2011-12-12 20:51:19 UTC ---
Author: vmakarov
Date: Mon Dec 12 20:51:16 2011
New Revision: 182263

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=182263
Log:
2011-12-12  Vladimir Makarov  <vmakarov@redhat.com>

    PR rtl-optimization/21617
    * ira-color.c (bucket_allocno_compare_func): Don't compare
    allocno classes.  Compare number of hard registers needed.


Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/ira-color.c



* [Bug rtl-optimization/21617] [4.4/4.5/4.6 Regression] CRC64 algorithm optimization problem on Intel 32-bit
From: jakub at gcc dot gnu.org @ 2011-12-12 21:26 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21617

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org
            Summary|[4.4/4.5/4.6/4.7            |[4.4/4.5/4.6 Regression]
                   |Regression] CRC64 algorithm |CRC64 algorithm
                   |optimization problem on     |optimization problem on
                   |Intel 32-bit                |Intel 32-bit

--- Comment #8 from Jakub Jelinek <jakub at gcc dot gnu.org> 2011-12-12 21:15:23 UTC ---
Fixed on the trunk.  I don't think it is desirable to backport this.



* [Bug rtl-optimization/21617] [4.5/4.6 Regression] CRC64 algorithm optimization problem on Intel 32-bit
From: jakub at gcc dot gnu.org @ 2012-03-13 15:31 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21617

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|4.4.7                       |4.5.4

--- Comment #9 from Jakub Jelinek <jakub at gcc dot gnu.org> 2012-03-13 12:48:16 UTC ---
4.4 branch is being closed, moving to 4.5.4 target.



* [Bug rtl-optimization/21617] [4.5/4.6 Regression] CRC64 algorithm optimization problem on Intel 32-bit
From: rguenth at gcc dot gnu.org @ 2012-07-02 12:11 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21617

Richard Guenther <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|4.5.4                       |4.6.4

--- Comment #10 from Richard Guenther <rguenth at gcc dot gnu.org> 2012-07-02 12:10:39 UTC ---
The 4.5 branch is being closed, adjusting target milestone.



* [Bug rtl-optimization/21617] [4.6 Regression] CRC64 algorithm optimization problem on Intel 32-bit
From: jakub at gcc dot gnu.org @ 2013-04-12 16:18 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21617

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED
   Target Milestone|4.6.4                       |4.7.0

--- Comment #11 from Jakub Jelinek <jakub at gcc dot gnu.org> 2013-04-12 16:17:59 UTC ---
The 4.6 branch has been closed, fixed in GCC 4.7.0.


