public inbox for gcc-bugs@sourceware.org
* [Bug target/36539]  New: [4.4 regression] IRA doesn't allocate asm output being returned to eax
@ 2008-06-14  6:48 astrange at ithinksw dot com
  2008-06-14  6:49 ` [Bug target/36539] " astrange at ithinksw dot com
                   ` (10 more replies)
  0 siblings, 11 replies; 12+ messages in thread
From: astrange at ithinksw dot com @ 2008-06-14  6:48 UTC (permalink / raw)
  To: gcc-bugs

Using today's IRA branch (r136683), on the attached file.

> gcc -O3 -fno-pic -fomit-frame-pointer -m64 -S cabac-ret.i -fira
_get_cabac:
LFB2:
        pushq   %rbx
LCFI0:
        movl    (%rdi), %eax
        movl    4(%rdi), %r8d
# 16 "cabac-ret.i" 1
        #%ebx %r8d %ax 24(%rdi) %rsi
# 0 "" 2
        movl    %eax, (%rdi)
        movl    %r8d, 4(%rdi)
        movl    %ebx, %eax
        popq    %rbx
        andl    $1, %eax
        ret

with an unnecessary mov %ebx, %eax. Without -fira:
        movl    (%rdi), %r8d
        movl    4(%rdi), %r9d
# 16 "cabac-ret.i" 1
        #%eax %r9d %r8w 24(%rdi) %rsi
# 0 "" 2
        movl    %r8d, (%rdi)
        movl    %r9d, 4(%rdi)
        andl    $1, %eax
        ret

Neither allocator allocates "bit" to eax in 32-bit mode, though every other
compiler with inline asm support that I tried did. gcc 3.3 does as well, but
no other version seemed to.

In this case it's not a problem, since changing the constraint to "=&a" fixes
it, but the function will be inlined a lot and I don't want to put unnecessary
constraints on it.
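
For illustration only, here is a minimal sketch of the constraint choice. It is
not the attached testcase: low_bit and the "movl" template are hypothetical
stand-ins, and the non-x86 branch exists only so the sketch builds elsewhere.

```c
/* Hypothetical stand-in for the attached testcase (not the real code). */
static int low_bit(int x)
{
    int bit;
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    /* "=&r": earlyclobber output in any general register, so the
       allocator is free to pick eax, the return register.
       Writing "=&a" instead would force eax at every inlined call site. */
    __asm__ ("movl %1, %0" : "=&r" (bit) : "r" (x));
#else
    bit = x;   /* portable fallback, an assumption for non-x86 builds */
#endif
    return bit & 1;
}
```

Ideally the allocator puts "bit" in eax on its own, so that no extra move is
needed before the final "andl $1, %eax".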


-- 
           Summary: [4.4 regression] IRA doesn't allocate asm output being
                    returned to eax
           Product: gcc
           Version: 4.4.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: astrange at ithinksw dot com
GCC target triplet: x86_64-*-*


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36539



* [Bug target/36539] [4.4 regression] IRA doesn't allocate asm output being returned to eax
From: astrange at ithinksw dot com @ 2008-06-14  6:49 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #1 from astrange at ithinksw dot com  2008-06-14 06:48 -------
Created an attachment (id=15771)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=15771&action=view)
testcase



* [Bug target/36539] IRA doesn't allocate asm output being returned to eax
From: pinskia at gcc dot gnu dot org @ 2008-06-14  6:53 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #2 from pinskia at gcc dot gnu dot org  2008-06-14 06:52 -------
IRA has not been committed to the trunk yet so it is not a regression (yet).


-- 

pinskia at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|RA doesn't allocate asm     |IRA doesn't allocate asm
                   |output being returned to eax|output being returned to eax



* [Bug target/36539] [4.4 regression] IRA doesn't allocate asm output being returned to eax
From: astrange at ithinksw dot com @ 2008-08-27  4:42 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #3 from astrange at ithinksw dot com  2008-08-27 04:41 -------
Now it is.


-- 

astrange at ithinksw dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|IRA doesn't allocate asm    |[4.4 regression] IRA doesn't
                   |output being returned to eax|allocate asm output being
                   |                            |returned to eax



* [Bug target/36539] [4.4 regression] IRA doesn't allocate asm output being returned to eax
From: pinskia at gcc dot gnu dot org @ 2008-08-29  4:43 UTC (permalink / raw)
  To: gcc-bugs



-- 

pinskia at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |pinskia at gcc dot gnu dot
                   |                            |org
   Target Milestone|---                         |4.4.0



* [Bug target/36539] [4.4 regression] IRA doesn't allocate asm output being returned to eax
From: vmakarov at redhat dot com @ 2008-08-29 19:19 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #4 from vmakarov at redhat dot com  2008-08-29 19:17 -------
I believe that patch

http://gcc.gnu.org/ml/gcc-patches/2008-08/msg02279.html

solves the problem.



* [Bug target/36539] [4.4 regression] IRA doesn't allocate asm output being returned to eax
From: astrange at ithinksw dot com @ 2008-09-04  4:04 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #5 from astrange at ithinksw dot com  2008-09-04 04:02 -------
It is fixed for me on x86-64. For i386 it's still suboptimal:
_get_cabac:
        subl    $28, %esp
        movl    %esi, 16(%esp)
        movl    %edi, 20(%esp)
        movl    %ebx, 12(%esp)
        movl    %ebp, 24(%esp)
        movl    32(%esp), %esi
        movl    36(%esp), %edi
        movl    (%esi), %eax
        movl    4(%esi), %ebx
# 16 "../cabac-ret.i" 1
        #%ebp %ebx %ax 16(%esi) %edi
# 0 "" 2
        movl    %eax, (%esi)
        movl    %ebx, 4(%esi)
        movl    %ebp, %eax
        movl    12(%esp), %ebx
        andl    $1, %eax
        movl    16(%esp), %esi
        movl    20(%esp), %edi
        movl    24(%esp), %ebp
        addl    $28, %esp
        ret

but that is not a regression (the code is worse without IRA).



* [Bug target/36539] IRA doesn't allocate asm output being returned to eax
From: jsm28 at gcc dot gnu dot org @ 2008-09-09 20:15 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #6 from jsm28 at gcc dot gnu dot org  2008-09-09 20:14 -------
Removing regression marker given the last comment.


-- 

jsm28 at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|[4.4 regression] IRA doesn't|IRA doesn't allocate asm
                   |allocate asm output being   |output being returned to eax
                   |returned to eax             |
   Target Milestone|4.4.0                       |---



* [Bug target/36539] IRA+i386 doesn't allocate asm output being returned to eax
From: astrange at ithinksw dot com @ 2008-09-18  1:31 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #7 from astrange at ithinksw dot com  2008-09-18 01:29 -------
Updated to 32-bit only.


-- 

astrange at ithinksw dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|normal                      |enhancement
 GCC target triplet|x86_64-*-*                  |i?86-*-*
            Summary|IRA doesn't allocate asm    |IRA+i386 doesn't allocate
                   |output being returned to eax|asm output being returned to
                   |                            |eax



* [Bug target/36539] IRA+i386 doesn't allocate asm output being returned to eax
From: astrange at ithinksw dot com @ 2008-12-05 20:09 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #8 from astrange at ithinksw dot com  2008-12-05 20:08 -------
With some recent changes IRA makes better decisions now but they don't survive
reload.

Using
> /gcc -O3 -fomit-frame-pointer -fno-pic -fdump-rtl-ira -S cabac-ret.i

I get about the same asm and this in the IRA dump:
**** Allocnos coloring:


  Loop 0 (parent -1, header bb0, depth 0)
    bbs: 2
    all: 0r64 1r58 2r62 3r59 4r60 5r63
    modified regnos: 58 59 60 62 63 64
    border:
    Pressure: GENERAL_REGS=6
    Reg 58 of GENERAL_REGS has 2 regs less
    Reg 62 of GENERAL_REGS has 2 regs less
    Reg 59 of GENERAL_REGS has 2 regs less
    Reg 60 of GENERAL_REGS has 2 regs less
    Reg 63 of GENERAL_REGS has 2 regs less
      Pushing a0(r64,l0)
      Pushing a3(r59,l0)(potential spill: pri=2857, cost=20000)
      Pushing a1(r58,l0)
      Pushing a5(r63,l0)
      Pushing a2(r62,l0)
      Pushing a4(r60,l0)
      Popping a4(r60,l0)  -- assign reg 3
      Popping a2(r62,l0)  -- assign reg 4
      Popping a5(r63,l0)  -- assign reg 0 <- "r"(state)
      Popping a1(r58,l0)  -- assign reg 0 <- "=&r"(bit)
      Popping a3(r59,l0)  -- assign reg 5
      Popping a0(r64,l0)  -- assign reg 0 <- returned bit&1

a1 and a5 should be conflicting, since a1 is an earlyclobber output and can't
share a register with any of the inputs. reload fixes this by moving it to a
worse register. 
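
The conflict described above can be stated as a small check. This is a sketch
with a hypothetical helper, earlyclobber_clash, which is not part of GCC; the
register numbers in the note below come from the dump.

```c
#include <stddef.h>

/* Returns 1 if an earlyclobber output shares a hard register with any
   input of the same asm, which is the situation reload has to repair. */
static int earlyclobber_clash(int out_reg, const int *in_regs, size_t n_in)
{
    for (size_t i = 0; i < n_in; i++)
        if (in_regs[i] == out_reg)
            return 1;
    return 0;
}
```

In the dump, "bit" (a1) and "state" (a5) were both given reg 0, so calling the
helper with out_reg 0 and an input list containing 0 reports a clash.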



* [Bug target/36539] IRA doesn't account for earlyclobber asm conflicts
From: law at redhat dot com @ 2010-01-29 11:01 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #9 from law at redhat dot com  2010-01-29 11:01 -------
At some point IRA's handling of earlyclobber constraints improved enough to get
the conflicts right in this PR.  Unfortunately, that is not sufficient to
generate good code.  

If we compile the testcase on i686-pc-linux-gnu with -O2 -fomit-frame-pointer,
we should be able to allocate hard regs for all the allocnos *and* do so
without generating any reloads.  Unfortunately, IRA makes poor register
selections which ultimately lead to reloading.

Pass 0 for finding pseudo/allocno costs

    a0 (r68,l0) best AREG, cover GENERAL_REGS
    a4 (r67,l0) best Q_REGS, cover GENERAL_REGS
    a3 (r66,l0) best GENERAL_REGS, cover GENERAL_REGS
    a1 (r65,l0) best GENERAL_REGS, cover GENERAL_REGS
    a5 (r64,l0) best GENERAL_REGS, cover GENERAL_REGS
    a2 (r63,l0) best GENERAL_REGS, cover GENERAL_REGS
    a7 (r59,l0) best GENERAL_REGS, cover GENERAL_REGS
    a6 (r58,l0) best GENERAL_REGS, cover GENERAL_REGS

  a0(r68,l0) costs: AREG:-1000,-1000 DREG:0,0 CREG:0,0 BREG:0,0 SIREG:0,0
DIREG:0,0 AD_REGS:0,0 CLOBBERED_REGS:0,0 Q_REGS:0,0 NON_Q_REGS:0,0
GENERAL_REGS:0,0 MEM:6000
  a1(r65,l0) costs: AREG:0,0 DREG:0,0 CREG:0,0 BREG:0,0 SIREG:0,0 DIREG:0,0
AD_REGS:0,0 CLOBBERED_REGS:0,0 Q_REGS:0,0 NON_Q_REGS:0,0 GENERAL_REGS:0,0
MEM:7000
  a2(r63,l0) costs: AREG:0,0 DREG:0,0 CREG:0,0 BREG:0,0 SIREG:0,0 DIREG:0,0
AD_REGS:0,0 CLOBBERED_REGS:0,0 Q_REGS:0,0 NON_Q_REGS:0,0 GENERAL_REGS:0,0
MEM:28000
  a3(r66,l0) costs: AREG:0,0 DREG:0,0 CREG:0,0 BREG:0,0 SIREG:0,0 DIREG:0,0
AD_REGS:0,0 CLOBBERED_REGS:0,0 Q_REGS:0,0 NON_Q_REGS:0,0 GENERAL_REGS:0,0
MEM:16000
  a4(r67,l0) costs: AREG:0,0 DREG:0,0 CREG:0,0 BREG:0,0 SIREG:4000,4000
DIREG:4000,4000 AD_REGS:0,0 CLOBBERED_REGS:0,0 Q_REGS:0,0 NON_Q_REGS:4000,4000
GENERAL_REGS:4000,4000 MEM:16000
  a5(r64,l0) costs: AREG:0,0 DREG:0,0 CREG:0,0 BREG:0,0 SIREG:0,0 DIREG:0,0
AD_REGS:0,0 CLOBBERED_REGS:0,0 Q_REGS:0,0 NON_Q_REGS:0,0 GENERAL_REGS:0,0
MEM:4000
  a6(r58,l0) costs: AREG:0,0 DREG:0,0 CREG:0,0 BREG:0,0 SIREG:0,0 DIREG:0,0
AD_REGS:0,0 CLOBBERED_REGS:0,0 Q_REGS:0,0 NON_Q_REGS:0,0 GENERAL_REGS:0,0
MEM:3000
  a7(r59,l0) costs: AREG:0,0 DREG:0,0 CREG:0,0 BREG:0,0 SIREG:0,0 DIREG:0,0
AD_REGS:0,0 CLOBBERED_REGS:0,0 Q_REGS:0,0 NON_Q_REGS:0,0 GENERAL_REGS:0,0
MEM:3000


Pass 1 for finding pseudo/allocno costs

    r68: preferred AREG, alternative GENERAL_REGS, cover GENERAL_REGS
    r67: preferred Q_REGS, alternative GENERAL_REGS, cover GENERAL_REGS
    r66: preferred GENERAL_REGS, alternative NO_REGS, cover GENERAL_REGS
    r65: preferred GENERAL_REGS, alternative NO_REGS, cover GENERAL_REGS
    r64: preferred GENERAL_REGS, alternative NO_REGS, cover GENERAL_REGS
    r63: preferred GENERAL_REGS, alternative NO_REGS, cover GENERAL_REGS
    r59: preferred GENERAL_REGS, alternative NO_REGS, cover GENERAL_REGS
    r58: preferred GENERAL_REGS, alternative NO_REGS, cover GENERAL_REGS

  a0(r68,l0) costs: AREG:-1000,-1000 DREG:0,0 CREG:0,0 BREG:0,0 SIREG:0,0
DIREG:0,0 AD_REGS:0,0 CLOBBERED_REGS:0,0 Q_REGS:0,0 NON_Q_REGS:0,0
GENERAL_REGS:0,0 MEM:6000
  a1(r65,l0) costs: AREG:0,0 DREG:0,0 CREG:0,0 BREG:0,0 SIREG:0,0 DIREG:0,0
AD_REGS:0,0 CLOBBERED_REGS:0,0 Q_REGS:0,0 NON_Q_REGS:0,0 GENERAL_REGS:0,0
MEM:7000
  a2(r63,l0) costs: AREG:0,0 DREG:0,0 CREG:0,0 BREG:0,0 SIREG:0,0 DIREG:0,0
AD_REGS:0,0 CLOBBERED_REGS:0,0 Q_REGS:0,0 NON_Q_REGS:0,0 GENERAL_REGS:0,0
MEM:28000
  a3(r66,l0) costs: AREG:0,0 DREG:0,0 CREG:0,0 BREG:0,0 SIREG:0,0 DIREG:0,0
AD_REGS:0,0 CLOBBERED_REGS:0,0 Q_REGS:0,0 NON_Q_REGS:0,0 GENERAL_REGS:0,0
MEM:16000
  a4(r67,l0) costs: AREG:0,0 DREG:0,0 CREG:0,0 BREG:0,0 SIREG:4000,4000
DIREG:4000,4000 AD_REGS:0,0 CLOBBERED_REGS:0,0 Q_REGS:0,0 NON_Q_REGS:4000,4000
GENERAL_REGS:4000,4000 MEM:16000
  a5(r64,l0) costs: AREG:0,0 DREG:0,0 CREG:0,0 BREG:0,0 SIREG:0,0 DIREG:0,0
AD_REGS:0,0 CLOBBERED_REGS:0,0 Q_REGS:0,0 NON_Q_REGS:0,0 GENERAL_REGS:0,0
MEM:4000
  a6(r58,l0) costs: AREG:0,0 DREG:0,0 CREG:0,0 BREG:0,0 SIREG:0,0 DIREG:0,0
AD_REGS:0,0 CLOBBERED_REGS:0,0 Q_REGS:0,0 NON_Q_REGS:0,0 GENERAL_REGS:0,0
MEM:3000
  a7(r59,l0) costs: AREG:0,0 DREG:0,0 CREG:0,0 BREG:0,0 SIREG:0,0 DIREG:0,0
AD_REGS:0,0 CLOBBERED_REGS:0,0 Q_REGS:0,0 NON_Q_REGS:0,0 GENERAL_REGS:0,0
MEM:3000

That looks fairly reasonable.  pseudo 68 is the return value, so assigning it
into ax is a win.  r67 wants Q_REGS so assigning it to SI, DI, or NON_Q has a
cost.


;; a0(r68,l0) conflicts:
;;     total conflict hard regs:
;;     conflict hard regs:
;; a1(r65,l0) conflicts: a3(r66,l0) a2(r63,l0) a4(r67,l0) a5(r64,l0)
;;     total conflict hard regs: 1 2
;;     conflict hard regs: 1 2
;; a2(r63,l0) conflicts: a1(r65,l0) a3(r66,l0) a4(r67,l0) a5(r64,l0) a6(r58,l0)
a7(r59,l0)
;;     total conflict hard regs: 1 2
;;     conflict hard regs: 1 2
;; a3(r66,l0) conflicts: a1(r65,l0) a2(r63,l0) a4(r67,l0) a5(r64,l0) a6(r58,l0)
;;     total conflict hard regs: 1 2
;;     conflict hard regs: 1 2
;; a4(r67,l0) conflicts: a1(r65,l0) a3(r66,l0) a2(r63,l0) a5(r64,l0)
;;     total conflict hard regs: 1 2
;;     conflict hard regs: 1 2
;; a5(r64,l0) conflicts: a1(r65,l0) a3(r66,l0) a2(r63,l0) a4(r67,l0)
;;     total conflict hard regs: 1 2
;;     conflict hard regs: 1 2
;; a6(r58,l0) conflicts: a3(r66,l0) a2(r63,l0) a7(r59,l0)
;;     total conflict hard regs:
;;     conflict hard regs:
;; a7(r59,l0) conflicts: a2(r63,l0) a6(r58,l0)
;;     total conflict hard regs:
;;     conflict hard regs:

  cp0:a3(r66)<->a7(r59)@1000:move
  cp1:a4(r67)<->a6(r58)@1000:move
  cp2:a0(r68)<->a1(r65)@125:shuffle


The r65/r68 copy will cause IRA to want to use ax for r65 and as a result will
increase the cost of ax for every pseudo which conflicts with r65 (r63, r64,
r66, r67).

So far, so good.  The problem is I don't see where we increase the cost of
Q_REGS for pseudos which conflict with r67.  So when we color:



      Pushing a7(r59,l0)
      Pushing a6(r58,l0)
      Pushing a0(r68,l0)
      Pushing a3(r66,l0)(potential spill: pri=2285, cost=16000)
      Pushing a4(r67,l0)(potential spill: pri=2666, cost=16000)
      Pushing a1(r65,l0)
      Pushing a5(r64,l0)
      Pushing a2(r63,l0)
      Popping a2(r63,l0)  -- assign reg 3
      Popping a5(r64,l0)  -- assign reg 4
      Popping a1(r65,l0)  -- assign reg 0
      Popping a4(r67,l0)  -- assign reg 5
      Popping a3(r66,l0)  -- assign reg 6
      Popping a0(r68,l0)  -- assign reg 0
      Popping a6(r58,l0)  -- assign reg 5
      Popping a7(r59,l0)  -- assign reg 6

We've assigned r63 into bx and thus r67 can't be assigned into bx.  We
ultimately assign r67 into di.  That in turn causes r65 to get spilled during
reload so that r67 can be reloaded into one of the Q_REGS, and ultimately we
generate crappy code.

It seems to me we ought to have code which increases the cost of Q_REGS for
pseudos conflicting with r67, but I can't seem to find it.  FWIW, if I manually
increase the cost of Q_REGS for the pseudos (r63, r64, r65, r66) I get the
following coloring:

      Pushing a7(r59,l0)
      Pushing a6(r58,l0)
      Pushing a0(r68,l0)
      Pushing a3(r66,l0)(potential spill: pri=2285, cost=16000)
      Pushing a4(r67,l0)(potential spill: pri=2666, cost=16000)
      Pushing a1(r65,l0)
      Pushing a5(r64,l0)
      Pushing a2(r63,l0)
      Popping a2(r63,l0)  -- assign reg 4
      Popping a5(r64,l0)  -- assign reg 5
      Popping a1(r65,l0)  -- assign reg 0
      Popping a4(r67,l0)  -- assign reg 3
      Popping a3(r66,l0)  -- assign reg 6
      Popping a0(r68,l0)  -- assign reg 0
      Popping a6(r58,l0)  -- assign reg 3
      Popping a7(r59,l0)  -- assign reg 6

Which is a perfect allocation requiring no reloads.  
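
The experiment above can be mimicked with a toy model. This is only a sketch:
the conflict lists, copies and pop order are transcribed from the dump, but the
greedy chooser, the per-register costs and the penalty value are assumptions
picked for illustration, not IRA's actual algorithm.

```c
#include <string.h>

enum { AX, DX, CX, BX, SI, DI, BP, NREGS };   /* i386 hard regs 0..6 */
enum { NALLOC = 8 };                          /* allocnos a0..a7 */

/* conflicts[a][b] != 0 when a and b conflict (from the dump above). */
static const int conflicts[NALLOC][NALLOC] = {
    {0,0,0,0,0,0,0,0},   /* a0 (r68) */
    {0,0,1,1,1,1,0,0},   /* a1 (r65) */
    {0,1,0,1,1,1,1,1},   /* a2 (r63) */
    {0,1,1,0,1,1,1,0},   /* a3 (r66) */
    {0,1,1,1,0,1,0,0},   /* a4 (r67) */
    {0,1,1,1,1,0,0,0},   /* a5 (r64) */
    {0,0,1,1,0,0,0,1},   /* a6 (r58) */
    {0,0,1,0,0,0,1,0},   /* a7 (r59) */
};
/* cp0, cp1, cp2 from the dump: {allocno, allocno, frequency}. */
static const int copies[3][3] = { {3,7,1000}, {4,6,1000}, {0,1,125} };
/* Pop order from the dump. */
static const int pop_order[NALLOC] = { 2, 5, 1, 4, 3, 0, 6, 7 };

/* Toy greedy chooser (an assumption, not IRA's code): each allocno takes
   the cheapest hard register not used by an already-colored conflicting
   allocno.  q_penalty is added to the Q_REGS cost of r63..r66. */
static void toy_color(int q_penalty, int assigned[NALLOC])
{
    static const int penalized[4] = { 1, 2, 3, 5 };   /* r65 r63 r66 r64 */
    int cost[NALLOC][NREGS];

    memset(cost, 0, sizeof cost);
    for (int a = 0; a < NALLOC; a++)
        assigned[a] = -1;
    cost[0][AX] = -1000;                    /* return value wants ax */
    cost[1][AX] = -125;                     /* cp2 pulls r65 toward ax */
    for (int a = 2; a <= 5; a++)
        cost[a][AX] += 125;                 /* these conflict with r65 */
    cost[4][SI] = cost[4][DI] = cost[4][BP] = 4000;   /* r67 wants Q_REGS */
    for (int i = 0; i < 4; i++)
        for (int r = AX; r <= BX; r++)
            cost[penalized[i]][r] += q_penalty;

    for (int k = 0; k < NALLOC; k++) {
        int a = pop_order[k], best = -1, c[NREGS];

        memcpy(c, cost[a], sizeof c);
        for (int i = 0; i < 3; i++) {       /* prefer a copy partner's reg */
            int other = copies[i][0] == a ? copies[i][1]
                      : copies[i][1] == a ? copies[i][0] : -1;
            if (other >= 0 && assigned[other] >= 0)
                c[assigned[other]] -= copies[i][2];
        }
        for (int r = 0; r < NREGS; r++) {
            /* dx and cx are conflict hard regs for a1..a5 in the dump */
            int busy = a >= 1 && a <= 5 && (r == DX || r == CX);
            for (int b = 0; b < NALLOC; b++)
                busy |= conflicts[a][b] && assigned[b] == r;
            if (!busy && (best < 0 || c[r] < c[best]))
                best = r;
        }
        assigned[a] = best;
    }
}
```

In this toy, toy_color(0, regs) puts r63 in bx (3) and leaves r67 with di (5),
while toy_color(100, regs) moves r63 to si (4) and gives r67 bx (3), matching
the two colorings quoted above. If the penalty exceeds the copy frequency of
125, the toy pushes r65 out of ax, so the result is sensitive to that value.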


-- 

law at redhat dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
     Ever Confirmed|0                           |1
   Last reconfirmed|0000-00-00 00:00:00         |2010-01-29 11:01:15
               date|                            |



* [Bug target/36539] Poor register allocation from IRA
From: vmakarov at redhat dot com @ 2010-01-29 20:34 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #10 from vmakarov at redhat dot com  2010-01-29 20:33 -------
Jeff, I saw an analogous problem when I worked on improving IRA performance.  I
checked the approach you are proposing, but it works considerably worse on
SPEC2000.  In the end, I found that the conflict cost technique works best when
we change the cost for only one hard register, in cases where the pseudo's best
cost is achieved on one hard register, e.g. the best cost is achieved on a
register class containing one hard register, or assigning a particular hard
register removes a copy.
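
The rule described above could be written as a predicate. The function name and
parameters are hypothetical, chosen only to illustrate the condition; they do
not come from the IRA sources.

```c
/* Hypothetical predicate, not a function from the GCC sources:
   adjust the conflict costs of neighbouring pseudos only when the
   preference pins down a single hard register. */
static int adjusts_conflict_costs(int best_class_size,
                                  int hard_reg_removes_copy)
{
    return best_class_size == 1 || hard_reg_removes_copy;
}
```

For AREG (one hard register) the predicate holds. For Q_REGS (ax, dx, cx, bx on
i386) without a copy it does not, which would explain why r67's preference is
not propagated to its conflicting pseudos in the dump above.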

Why does the technique you are proposing not work well on average for classes
(like Q_REGS in this case) containing more than one register?  This is just my
speculation.  If the number of conflicting pseudos is less than the size of
Q_REGS, we should not modify the conflict costs of the pseudo for Q_REGS,
because Q_REGS can be more profitable for the conflicting pseudos and we will
still assign a Q_REGS register to the pseudo.  Even if the number of
conflicting pseudos is greater than the size of Q_REGS, they still might be
assigned to hard registers which are only part of Q_REGS.  It is hard to
predict.

I am not saying that we should not work on this problem.  I think we should try
more sophisticated heuristics, although I don't know which one.  It could be
conflict cost modification only when register pressure for Q_REGS is high
during the pseudo's live range, but such a heuristic will take some time to
implement and I am still not sure that it will work better on average.

Unfortunately, there will be cases where RA could work better, because RA
algorithms are heuristic.  What we should focus on is improving performance on
credible benchmarks like SPEC2000/SPEC2006.



end of thread, other threads:[~2010-01-29 20:34 UTC | newest]

Thread overview: 12+ messages
2008-06-14  6:48 [Bug target/36539] New: [4.4 regression] IRA doesn't allocate asm output being returned to eax astrange at ithinksw dot com
2008-06-14  6:49 ` [Bug target/36539] " astrange at ithinksw dot com
2008-06-14  6:53 ` [Bug target/36539] " pinskia at gcc dot gnu dot org
2008-08-27  4:42 ` [Bug target/36539] [4.4 regression] " astrange at ithinksw dot com
2008-08-29  4:43 ` pinskia at gcc dot gnu dot org
2008-08-29 19:19 ` vmakarov at redhat dot com
2008-09-04  4:04 ` astrange at ithinksw dot com
2008-09-09 20:15 ` [Bug target/36539] " jsm28 at gcc dot gnu dot org
2008-09-18  1:31 ` [Bug target/36539] IRA+i386 " astrange at ithinksw dot com
2008-12-05 20:09 ` astrange at ithinksw dot com
2010-01-29 11:01 ` [Bug target/36539] IRA doesn't account for earlyclobber asm conflicts law at redhat dot com
2010-01-29 20:34 ` [Bug target/36539] Poor register allocation from IRA vmakarov at redhat dot com
