public inbox for gcc-bugs@sourceware.org
* [Bug rtl-optimization/54133] New: regrename introduces additional dependencies
@ 2012-07-31  6:59 amker.cheng at gmail dot com
  2012-07-31  8:01 ` [Bug rtl-optimization/54133] " steven at gcc dot gnu.org
                   ` (7 more replies)
  0 siblings, 8 replies; 9+ messages in thread
From: amker.cheng at gmail dot com @ 2012-07-31  6:59 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54133

             Bug #: 54133
           Summary: regrename introduces additional dependencies
    Classification: Unclassified
           Product: gcc
           Version: 4.8.0
            Status: UNCONFIRMED
          Severity: enhancement
          Priority: P3
         Component: rtl-optimization
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: amker.cheng@gmail.com


With test program below:
typedef struct
{
    double X, Y;
} Point;

typedef struct
{
    Point p1;
    Point c1;
    Point c2;
    Point p2;
} Curve;


double bar(double t, double p0, double p1, double p2, double p3);
void foo( Curve *curve, int count )
{
    int n;
    int step;
    Point point;
    Curve c0;
    double t;
    for ( n = 0; n < count; ++n )
    {
        c0 = curve[n];

        for ( step = 0; step < (10); ++step )
        {
            t = ((double)(step)) / (double)(10);
            point.X = bar( t, c0.p1.X, c0.c1.X, c0.c2.X, c0.p2.X );
            point.Y = bar( t, c0.p1.Y, c0.c1.Y, c0.c2.Y, c0.p2.Y );
        }
    }
}

Compiled with command line:
arm-none-eabi-gcc -mthumb -mcpu=cortex-m0 -Os -frename-registers -S

The dumps before and after regrename look like this:
1. before regrename:
(insn 157 80 158 4 (set (reg:SI 4 r4 [180])
        (reg:SI 0 r0)) ../office_pointio.E:29 187 {*thumb1_movsi_insn}
     (expr_list:REG_DEAD (reg:SI 0 r0)
        (nil)))

(insn 158 157 147 4 (set (reg:SI 5 r5 [+4 ])
        (reg:SI 1 r1 [+4 ])) ../office_pointio.E:29 187 {*thumb1_movsi_insn}
     (expr_list:REG_DEAD (reg:SI 1 r1 [+4 ])
        (nil)))

(insn 147 158 83 4 (set (reg:DF 2 r2)
        (mem/c:DF (plus:SI (reg/f:SI 13 sp)
                (const_int 40 [0x28])) [6 %sfp+-56 S8 A64]))
../office_pointio.E:30 205 {*thumb_movdf_insn}
     (nil))

(insn 83 147 148 4 (set (mem:DF (reg/f:SI 13 sp) [0 S8 A64])
        (reg:DF 2 r2)) ../office_pointio.E:30 205 {*thumb_movdf_insn}
     (expr_list:REG_DEAD (reg:DF 2 r2)
        (nil)))

(insn 148 83 84 4 (set (reg:DF 2 r2)
        (mem/c:DF (plus:SI (reg/f:SI 13 sp)
                (const_int 56 [0x38])) [6 %sfp+-40 S8 A64]))
../office_pointio.E:30 205 {*thumb_movdf_insn}
     (nil))

(insn 84 148 149 4 (set (mem:DF (plus:SI (reg/f:SI 13 sp)
                (const_int 8 [0x8])) [0 S8 A64])
        (reg:DF 2 r2)) ../office_pointio.E:30 205 {*thumb_movdf_insn}
     (expr_list:REG_DEAD (reg:DF 2 r2)
        (nil)))

(insn 149 84 85 4 (set (reg:DF 2 r2)
        (mem/c:DF (plus:SI (reg/f:SI 13 sp)
                (const_int 72 [0x48])) [6 %sfp+-24 S8 A64]))
../office_pointio.E:30 205 {*thumb_movdf_insn}
     (nil))

(insn 85 149 159 4 (set (mem:DF (plus:SI (reg/f:SI 13 sp)
                (const_int 16 [0x10])) [0 S8 A64])
        (reg:DF 2 r2)) ../office_pointio.E:30 205 {*thumb_movdf_insn}
     (expr_list:REG_DEAD (reg:DF 2 r2)
        (nil)))

(insn 159 85 160 4 (set (reg:SI 0 r0)
        (reg:SI 4 r4 [180])) ../office_pointio.E:30 187 {*thumb1_movsi_insn}
     (nil))

(insn 160 159 87 4 (set (reg:SI 1 r1 [+4 ])
        (reg:SI 5 r5 [+4 ])) ../office_pointio.E:30 187 {*thumb1_movsi_insn}
     (nil))

2. after regrename:
(insn 157 80 158 4 (set (reg:SI 4 r4 [180])
        (reg:SI 0 r0)) ../office_pointio.E:29 187 {*thumb1_movsi_insn}
     (expr_list:REG_DEAD (reg:SI 0 r0)
        (nil)))

(insn 158 157 147 4 (set (reg:SI 5 r5 [+4 ])
        (reg:SI 1 r1 [+4 ])) ../office_pointio.E:29 187 {*thumb1_movsi_insn}
     (expr_list:REG_DEAD (reg:SI 1 r1 [+4 ])
        (nil)))

(insn 147 158 83 4 (set (reg:DF 0 r0)
        (mem/c:DF (plus:SI (reg/f:SI 13 sp)
                (const_int 40 [0x28])) [6 %sfp+-56 S8 A64]))
../office_pointio.E:30 205 {*thumb_movdf_insn}
     (nil))

(insn 83 147 148 4 (set (mem:DF (reg/f:SI 13 sp) [0 S8 A64])
        (reg:DF 0 r0)) ../office_pointio.E:30 205 {*thumb_movdf_insn}
     (expr_list:REG_DEAD (reg:DF 2 r2)
        (nil)))

(insn 148 83 84 4 (set (reg:DF 2 r2)
        (mem/c:DF (plus:SI (reg/f:SI 13 sp)
                (const_int 56 [0x38])) [6 %sfp+-40 S8 A64]))
../office_pointio.E:30 205 {*thumb_movdf_insn}
     (nil))

(insn 84 148 149 4 (set (mem:DF (plus:SI (reg/f:SI 13 sp)
                (const_int 8 [0x8])) [0 S8 A64])
        (reg:DF 2 r2)) ../office_pointio.E:30 205 {*thumb_movdf_insn}
     (expr_list:REG_DEAD (reg:DF 2 r2)
        (nil)))

(insn 149 84 85 4 (set (reg:DF 1 r1)
        (mem/c:DF (plus:SI (reg/f:SI 13 sp)
                (const_int 72 [0x48])) [6 %sfp+-24 S8 A64]))
../office_pointio.E:30 205 {*thumb_movdf_insn}
     (nil))

(insn 85 149 159 4 (set (mem:DF (plus:SI (reg/f:SI 13 sp)
                (const_int 16 [0x10])) [0 S8 A64])
        (reg:DF 1 r1)) ../office_pointio.E:30 205 {*thumb_movdf_insn}
     (expr_list:REG_DEAD (reg:DF 2 r2)
        (nil)))

(insn 159 85 160 4 (set (reg:SI 0 r0)
        (reg:SI 4 r4 [180])) ../office_pointio.E:30 187 {*thumb1_movsi_insn}
     (nil))

(insn 160 159 87 4 (set (reg:SI 1 r1 [+4 ])
        (reg:SI 5 r5 [+4 ])) ../office_pointio.E:30 187 {*thumb1_movsi_insn}
     (nil))

Renaming r2 to r0 in chain(147/83) and to r1 in chain(149/85) overwrites r0/r1,
which prevents insns 159/160 from being deleted by cprop_hardreg/dce.
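
To illustrate why the choice of renaming register matters, here is a minimal
sketch in C. It is a toy model, not GCC code. It assumes that a DFmode value
occupies two consecutive 32-bit core registers on Thumb-1, so r0:DF covers r0
and r1, and r1:DF covers r1 and r2. The names regs_covered and clobbers are
made up for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption for this toy model: an SImode value needs one core
   register and a DFmode value needs two consecutive ones.  */
enum { SI_NREGS = 1, DF_NREGS = 2 };

/* Bit mask of the hard registers regno .. regno+nregs-1.  */
static uint32_t
regs_covered (unsigned regno, unsigned nregs)
{
  assert (nregs >= 1 && regno + nregs <= 16);
  return ((1u << nregs) - 1u) << regno;
}

/* True if writing (dst_regno, dst_nregs) destroys a value that is
   still held in (live_regno, live_nregs).  */
static bool
clobbers (unsigned dst_regno, unsigned dst_nregs,
          unsigned live_regno, unsigned live_nregs)
{
  return (regs_covered (dst_regno, dst_nregs)
          & regs_covered (live_regno, live_nregs)) != 0;
}
```

In this model clobbers (2, DF_NREGS, 0, SI_NREGS) is false for the original
r2:DF temporaries, while clobbers (0, DF_NREGS, 1, SI_NREGS) is true after the
rename, so the values saved in r4/r5 really have to be copied back.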



* [Bug rtl-optimization/54133] regrename introduces additional dependencies
  2012-07-31  6:59 [Bug rtl-optimization/54133] New: regrename introduces additional dependencies amker.cheng at gmail dot com
@ 2012-07-31  8:01 ` steven at gcc dot gnu.org
  2012-08-01  7:50 ` amker.cheng at gmail dot com
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: steven at gcc dot gnu.org @ 2012-07-31  8:01 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54133

Steven Bosscher <steven at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2012-07-31
                 CC|                            |steven at gcc dot gnu.org
     Ever Confirmed|0                           |1

--- Comment #1 from Steven Bosscher <steven at gcc dot gnu.org> 2012-07-31 08:00:38 UTC ---
Confirmed.
(You can try -fdump-rtl-all-slim for dumps that are easier to interpret.)



* [Bug rtl-optimization/54133] regrename introduces additional dependencies
  2012-07-31  6:59 [Bug rtl-optimization/54133] New: regrename introduces additional dependencies amker.cheng at gmail dot com
  2012-07-31  8:01 ` [Bug rtl-optimization/54133] " steven at gcc dot gnu.org
@ 2012-08-01  7:50 ` amker.cheng at gmail dot com
  2012-08-01 10:14 ` steven at gcc dot gnu.org
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: amker.cheng at gmail dot com @ 2012-08-01  7:50 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54133

--- Comment #2 from amker.cheng <amker.cheng at gmail dot com> 2012-08-01 07:49:51 UTC ---
I measured this kind of regression in the CSiBE benchmark on
arm-none-eabi/cortex-m0 with -Os. It turns out that most of them are
related to moves of parameter/return registers, as in the reported case.

The logic is:
STEP1: In the prologue or after a call_insn, gcc saves parameter (or return)
registers in pseudos, then loads the value back from the pseudo when it is
needed (for example, to call another function with the same parameter).
For example:
{
  rx <- r0
  ...
  ...
  r0 <- rx
  call another function
}

If the instructions between the save and the use do not clobber the parameter
register, the hard register can be propagated to remove one redundant move
instruction.

STEP2: Copy propagation before IRA simply ignores hard registers, so usually
this can only be done in regcprop.c, after IRA.

BUT,
STEP3: Register renaming does not honor any propagation opportunities and may
pick r0 as the renaming register, which introduces additional dependencies.
It is a common regression because regrename always selects the renaming
register by scanning from 0 to FIRST_PSEUDO_REGISTER.


In an experiment, if I exclude r0/r1 from renaming, most of the regressions
observed in CSiBE are gone.

So how should this be fixed? Thanks.
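
The three steps can be sketched with a tiny value-tracking pass in C. This is
only a toy model of the idea, not regcprop.c: registers are plain integers,
each instruction is either a register copy or a load of an unrelated value,
and the names struct move, FRESH and count_redundant_moves are invented for
the example.

```c
#include <assert.h>
#include <stddef.h>

enum { NUM_REGS = 8, FRESH = -1 };

/* One toy instruction: DST <- SRC, or DST <- <new value> if SRC is FRESH.  */
struct move { int dst; int src; };

/* Count copies whose destination already holds the value being copied.
   Such a copy is a no-op that a copy-propagation pass could delete.  */
static size_t
count_redundant_moves (const struct move *insns, size_t n)
{
  int value[NUM_REGS];
  int next_value = NUM_REGS;
  size_t redundant = 0;

  for (int r = 0; r < NUM_REGS; r++)
    value[r] = r;               /* every register starts with its own value */

  for (size_t i = 0; i < n; i++)
    {
      int dst = insns[i].dst, src = insns[i].src;
      assert (dst >= 0 && dst < NUM_REGS);
      assert (src == FRESH || (src >= 0 && src < NUM_REGS));
      if (src == FRESH)
        value[dst] = next_value++;      /* unrelated value, e.g. a load */
      else if (value[dst] == value[src])
        redundant++;                    /* restore of an untouched register */
      else
        value[dst] = value[src];
    }
  return redundant;
}
```

For the sequence r4 <- r0; r2 <- load; r2 <- load; r0 <- r4 the last copy is
counted as redundant. If the first temporary is renamed to r0, the same
sequence has no redundant copy, which is the dependency this report is about.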



* [Bug rtl-optimization/54133] regrename introduces additional dependencies
  2012-07-31  6:59 [Bug rtl-optimization/54133] New: regrename introduces additional dependencies amker.cheng at gmail dot com
  2012-07-31  8:01 ` [Bug rtl-optimization/54133] " steven at gcc dot gnu.org
  2012-08-01  7:50 ` amker.cheng at gmail dot com
@ 2012-08-01 10:14 ` steven at gcc dot gnu.org
  2012-08-01 11:58 ` steven at gcc dot gnu.org
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: steven at gcc dot gnu.org @ 2012-08-01 10:14 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54133

--- Comment #3 from Steven Bosscher <steven at gcc dot gnu.org> 2012-08-01 10:13:32 UTC ---
With "GCC: (GNU) 4.8.0 20120731 (experimental) [trunk revision 190015]" the
dumps look slightly different. I'm using the -fdump-rtl-all-slim dumps (with a
local patch to dump SEQUENCEs also) and have this dump, showing the same
problem:

  BEFORE REGRENAME (.207r.ce3)   ----> AFTER REGRENAME (.208r.rnreg)
  154 r4:SI=r0:SI                  =   154 r4:SI=r0:SI
      REG_DEAD: r0:SI              =       REG_DEAD: r0:SI
  155 r5:SI=r1:SI                  =   155 r5:SI=r1:SI
      REG_DEAD: r1:SI              =       REG_DEAD: r1:SI
  144 r2:DF=[sp:SI+0x28]           |   144 r0:DF=[sp:SI+0x28]
   80 [sp:SI]=r2:DF                |    80 [sp:SI]=r0:DF
      REG_DEAD: r2:DF              =       REG_DEAD: r2:DF
  145 r2:DF=[sp:SI+0x38]           =   145 r2:DF=[sp:SI+0x38]
   81 [sp:SI+0x8]=r2:DF            =    81 [sp:SI+0x8]=r2:DF
      REG_DEAD: r2:DF              =       REG_DEAD: r2:DF
  146 r2:DF=[sp:SI+0x48]           |   146 r1:DF=[sp:SI+0x48]
   82 [sp:SI+0x10]=r2:DF           |    82 [sp:SI+0x10]=r1:DF
      REG_DEAD: r2:DF              =       REG_DEAD: r2:DF
  156 r0:SI=r4:SI                  =   156 r0:SI=r4:SI
  157 r1:SI=r5:SI                  =   157 r1:SI=r5:SI
   84 r2:DF=[sp:SI+0x18]           =    84 r2:DF=[sp:SI+0x18]
   85 r0:DF=call [`bar'] argc:0x18 =    85 r0:DF=call [`bar'] argc:0x18
      REG_DEAD: r2:DF              =       REG_DEAD: r2:DF
      REG_UNUSED: r0:DF            =       REG_UNUSED: r0:DF
  147 r2:DF=[sp:SI+0x30]           |   147 r0:DF=[sp:SI+0x30]
   86 [sp:SI]=r2:DF                |    86 [sp:SI]=r0:DF
      REG_DEAD: r2:DF              =       REG_DEAD: r2:DF
  148 r2:DF=[sp:SI+0x40]           =   148 r2:DF=[sp:SI+0x40]
   87 [sp:SI+0x8]=r2:DF            =    87 [sp:SI+0x8]=r2:DF
      REG_DEAD: r2:DF              =       REG_DEAD: r2:DF
  149 r2:DF=[sp:SI+0x50]           |   149 r1:DF=[sp:SI+0x50]
   88 [sp:SI+0x10]=r2:DF           |    88 [sp:SI+0x10]=r1:DF
      REG_DEAD: r2:DF              =       REG_DEAD: r2:DF
  158 r0:SI=r4:SI                  =   158 r0:SI=r4:SI
      REG_DEAD: r4:SI              =       REG_DEAD: r4:SI
  159 r1:SI=r5:SI                  =   159 r1:SI=r5:SI
      REG_DEAD: r5:SI              =       REG_DEAD: r5:SI
   90 r2:DF=[sp:SI+0x20]           =    90 r2:DF=[sp:SI+0x20]
   91 r0:DF=call [`bar'] argc:0x18 =    91 r0:DF=call [`bar'] argc:0x18
      REG_DEAD: r2:DF              =       REG_DEAD: r2:DF
      REG_UNUSED: r0:DF            =       REG_UNUSED: r0:DF

The sets of r4 and r5 are actually not used for anything other than
preserving/reloading the values of r0 and r1 for the second call to bar. To
understand how the sets of r4 and r5 come into existence to begin with, we need
to look at the pre-regalloc dumps, e.g. the .191r.asmcons dump:

   77 r0:DF=call [`__aeabi_ddiv'] argc:0
      REG_DEAD: r2:DF
      REG_EH_REGION: 0xffffffff80000000
   78 r177:DF=r0:DF
      REG_DEAD: r0:DF
   80 [sp:SI]=r166:DF
   81 [sp:SI+0x8]=r168:DF
   82 [sp:SI+0x10]=r170:DF
   83 r0:DF=r177:DF
   84 r2:DF=r164:DF
   85 r0:DF=call [`bar'] argc:0x18
      REG_DEAD: r2:DF
      REG_UNUSED: r0:DF
   86 [sp:SI]=r167:DF
   87 [sp:SI+0x8]=r169:DF
   88 [sp:SI+0x10]=r171:DF
   89 r0:DF=r177:DF
      REG_DEAD: r177:DF

In insn 78, r177 is used to memoize the call result from __aeabi_ddiv (which is
the variable "t" in the source code). IRA goes to work on this and finds:

Reg 177: local to bb 4 def dominates all uses has unique first use
Found def insn 78 for 177 to be not moveable
(insn 78 is not moveable because a hard register is involved in the SET_SRC)
   Insn 78(l0): point = 37
   Insn 89(l0): point = 17
   Insn 88(l0): point = 19

;; a5(r177,l0) conflicts:
;;   subobject 0: a1(r159,l0) a2(r172,l0) a3(r163,l0) a4(r165,w0,l0)
a4(r165,w1,l0) a6(r171,w0,l0) a6(r171,w1,l0) a7(r169,w0,l0) a7(r169,w1,l0)
a8(r167,w0,l0) a8(r167,w1,l0) a9(r164,w0,l0) a9(r164,w1,l0) a10(r170,w0,l0)
a10(r170,w1,l0) a11(r168,w0,l0) a11(r168,w1,l0) a12(r166,w0,l0) a12(r166,w1,l0)
a0(r175,l0)
;;     total conflict hard regs: 0-3 12
;;     conflict hard regs: 0-3 12

So r177 conflicts with r0 and can't be coalesced, and r177 ends up allocated to
the first available hard register, which is r4. In .198r.split2, the DFmode set
in insn 78 is split to set r4 and r5.
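
The allocation outcome described above can be mimicked with a small C sketch.
This is a toy model, not IRA: it assumes that only the low registers r0-r7 are
allocatable on Thumb-1 and simply takes the lowest free run of registers.
first_free_reg is a hypothetical helper written for this note.

```c
#include <assert.h>
#include <stdint.h>

/* Return the lowest register number R such that R .. R+nregs-1 are all
   allocatable and none of them is in the conflict set, or -1 if there
   is no such R (in which case the value would have to be spilled).  */
static int
first_free_reg (uint32_t allocatable, uint32_t conflicts, unsigned nregs)
{
  assert (nregs >= 1 && nregs <= 16);
  uint32_t usable = allocatable & ~conflicts;
  uint32_t need = (1u << nregs) - 1u;

  for (unsigned r = 0; r + nregs <= 16; r++)
    if (((usable >> r) & need) == need)
      return (int) r;
  return -1;
}
```

With allocatable registers r0-r7 (mask 0xFF), the conflict set 0-3 and 12
(mask 0x100F) and a two-register DFmode value, the sketch returns 4, matching
the r4/r5 pair seen in the dump.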

This issue can in theory be fixed before reload: You'd have to copy-propagate
the hard-register set of r177 in insn 78 to its use in insn 83. There is a risk
that this won't work in general because you can't know before reload whether r0
will be needed for reloads in the intervening insns and you may end up
increasing register pressure and spoiling the code. Therefore you'd want to
propagate as late as possible. That would be the regmove pass.

I'm trying something, will post later today...



* [Bug rtl-optimization/54133] regrename introduces additional dependencies
  2012-07-31  6:59 [Bug rtl-optimization/54133] New: regrename introduces additional dependencies amker.cheng at gmail dot com
                   ` (2 preceding siblings ...)
  2012-08-01 10:14 ` steven at gcc dot gnu.org
@ 2012-08-01 11:58 ` steven at gcc dot gnu.org
  2012-08-01 13:49 ` amker.cheng at gmail dot com
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: steven at gcc dot gnu.org @ 2012-08-01 11:58 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54133

--- Comment #4 from Steven Bosscher <steven at gcc dot gnu.org> 2012-08-01 11:58:00 UTC ---
Created attachment 27918
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=27918
Hack regmove to do limited propagation of hard regs

I have a patch to make the propagation happen in regmove, resulting in the
following RTL in the IRA dump:

   78 r177:DF=r0:DF
   80 [sp:SI]=r166:DF
   81 [sp:SI+0x8]=r168:DF
   82 [sp:SI+0x10]=r170:DF
   84 r2:DF=r164:DF
   85 r0:DF=call [`bar'] argc:0x18
      REG_DEAD: r2:DF
      REG_UNUSED: r0:DF
   86 [sp:SI]=r167:DF
   87 [sp:SI+0x8]=r169:DF
   88 [sp:SI+0x10]=r171:DF
   89 r0:DF=r177:DF
      REG_DEAD: r177:DF
   90 r2:DF=r165:DF
   91 r0:DF=call [`bar'] argc:0x18

But now IRA decides to spill r177 to memory:

      Allocno a5r177 of GENERAL_REGS(9) has 4 avail. regs  4-7, node:  4-7 obj
0 (confl regs =  0-3 8-102),  obj 1 (confl regs =  0-3 8-102)

      Pushing a5(r177,l0)(potential spill: pri=469, cost=38000)
        Making a1(r159,l0) colorable
        Making a2(r172,l0) colorable
        Making a3(r163,l0) colorable
      Pushing a1(r159,l0)(cost 32000)
      Pushing a3(r163,l0)(cost 40000)
      Pushing a2(r172,l0)(cost 88000)
      Popping a2(r172,l0)  -- assign reg 4
      Popping a3(r163,l0)  -- assign reg 5
      Popping a1(r159,l0)  -- assign reg 6
      Popping a5(r177,l0)  -- (memory is more profitable 32000 vs 2147483647)
spill
      Popping a0(r175,l0)  -- assign reg 7

This makes the inner loop 2 instructions smaller, but I haven't investigated
whether that's an improvement or not.



* [Bug rtl-optimization/54133] regrename introduces additional dependencies
  2012-07-31  6:59 [Bug rtl-optimization/54133] New: regrename introduces additional dependencies amker.cheng at gmail dot com
                   ` (3 preceding siblings ...)
  2012-08-01 11:58 ` steven at gcc dot gnu.org
@ 2012-08-01 13:49 ` amker.cheng at gmail dot com
  2012-08-02  7:22 ` ebotcazou at gcc dot gnu.org
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: amker.cheng at gmail dot com @ 2012-08-01 13:49 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54133

--- Comment #5 from amker.cheng <amker.cheng at gmail dot com> 2012-08-01 13:48:50 UTC ---
Thanks for your patch. IMHO the problem cannot be fixed this way, because:
1. 
   78 r177:DF=r0:DF
   80 [sp:SI]=r166:DF
   81 [sp:SI+0x8]=r168:DF
   82 [sp:SI+0x10]=r170:DF
   84 r2:DF=r164:DF
   85 r0:DF=call [`bar'] argc:0x18
      REG_DEAD: r2:DF
      REG_UNUSED: r0:DF
   86 [sp:SI]=r167:DF
   87 [sp:SI+0x8]=r169:DF
   88 [sp:SI+0x10]=r171:DF
   89 r0:DF=r177:DF
      REG_DEAD: r177:DF
   90 r2:DF=r165:DF
   91 r0:DF=call [`bar'] argc:0x18

The propagation actually increases register pressure from insn 78 to insn 85,
since r177 and r0 are both live now.
Maybe IRA makes a better decision in this case by spilling r177, but I doubt
the results are good in general.

2. The reported case is somewhat special in that all related insns are
confined to one basic block. In other cases, like those described in comment 2,
the hard register is saved in the prologue, so the propagation crosses basic
blocks.

Anyway, one thing is clear: the problem is closely connected with
parameter/return register moves.



* [Bug rtl-optimization/54133] regrename introduces additional dependencies
  2012-07-31  6:59 [Bug rtl-optimization/54133] New: regrename introduces additional dependencies amker.cheng at gmail dot com
                   ` (4 preceding siblings ...)
  2012-08-01 13:49 ` amker.cheng at gmail dot com
@ 2012-08-02  7:22 ` ebotcazou at gcc dot gnu.org
  2012-08-02 10:18 ` amker.cheng at gmail dot com
  2012-09-25  7:45 ` amker.cheng at gmail dot com
  7 siblings, 0 replies; 9+ messages in thread
From: ebotcazou at gcc dot gnu.org @ 2012-08-02  7:22 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54133

Eric Botcazou <ebotcazou at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ebotcazou at gcc dot
                   |                            |gnu.org

--- Comment #6 from Eric Botcazou <ebotcazou at gcc dot gnu.org> 2012-08-02 07:21:50 UTC ---
> In experiment, if I disable r0/r1 from renaming, most regressions observed in
> CSiBE are gone.
> 
> So how should this be fixed? Thanks.

The choice of the renaming register can be parameterized at the class level,
but I'm not sure this would work here.  You could also try to add some
additional heuristics for this choice, as it seems to be clearly
counter-productive here.
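
One possible shape for such a heuristic, sketched in C. This is purely
hypothetical and is not how regrename.c works today: pick_rename_reg and its
avoid mask are invented here. The idea is to prefer a candidate outside a set
of registers to avoid (for example the argument/return registers r0/r1) and to
fall back to the usual lowest-numbered choice only when nothing else is free.

```c
#include <assert.h>
#include <stdint.h>

/* Pick the lowest-numbered register in CANDIDATES, preferring one that
   is not in AVOID.  Returns -1 if CANDIDATES is empty.  The toy model
   only covers the 16 core registers.  */
static int
pick_rename_reg (uint32_t candidates, uint32_t avoid)
{
  assert ((candidates >> 16) == 0);

  uint32_t preferred = candidates & ~avoid;
  uint32_t pool = preferred ? preferred : candidates;

  for (int r = 0; r < 16; r++)
    if (pool & (1u << r))
      return r;
  return -1;
}
```

With candidates r0, r1 and r3 and avoid set to r0/r1 the sketch picks r3. When
only r0/r1 are free it still returns r0, so no renaming opportunity is dropped
outright. Whether this helps on CSiBE is not something the sketch can answer.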



* [Bug rtl-optimization/54133] regrename introduces additional dependencies
  2012-07-31  6:59 [Bug rtl-optimization/54133] New: regrename introduces additional dependencies amker.cheng at gmail dot com
                   ` (5 preceding siblings ...)
  2012-08-02  7:22 ` ebotcazou at gcc dot gnu.org
@ 2012-08-02 10:18 ` amker.cheng at gmail dot com
  2012-09-25  7:45 ` amker.cheng at gmail dot com
  7 siblings, 0 replies; 9+ messages in thread
From: amker.cheng at gmail dot com @ 2012-08-02 10:18 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54133

--- Comment #7 from amker.cheng <amker.cheng at gmail dot com> 2012-08-02 10:18:41 UTC ---
(In reply to comment #6)
> > In experiment, if I disable r0/r1 from renaming, most regressions observed in
> > CSiBE are gone.
> > 
> > So how should this be fixed? Thanks.
> 
> The choice of the renaming register can be parameterized at the class level,
> but I'm not sure this would work here.  You could also try to add some
> additional heuristics for this choice, as it seems to be clearly
> counter-productive here.

My bad, I did not give details of the method of excluding r0/r1 from
renaming.
Compared to trunk (where regrename is disabled for -Os), the method fixes
most of the regrename regressions, which is good.
But it is too conservative, so some renaming opportunities are missed. In
terms of code size, the data show that this method has 700/440 bytes of
benefit/regression against the current implementation of regrename. This means
only about 250 bytes of benefit overall.
The data were collected from CSiBE on ARM Cortex-M0.

Given that the regressions may cross basic blocks, it is hard to fix them in
regrename without missing renaming opportunities.

Is it possible to run the regcprop pass both before and after regrename?



* [Bug rtl-optimization/54133] regrename introduces additional dependencies
  2012-07-31  6:59 [Bug rtl-optimization/54133] New: regrename introduces additional dependencies amker.cheng at gmail dot com
                   ` (6 preceding siblings ...)
  2012-08-02 10:18 ` amker.cheng at gmail dot com
@ 2012-09-25  7:45 ` amker.cheng at gmail dot com
  7 siblings, 0 replies; 9+ messages in thread
From: amker.cheng at gmail dot com @ 2012-09-25  7:45 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54133

--- Comment #8 from amker.cheng <amker.cheng at gmail dot com> 2012-09-25 07:45:02 UTC ---
I have spent some time investigating this bug and now I think I understand the
issue.

The problematic instruction patterns, which save/restore argument/return
registers, are generated/kept on Thumb1 because the ARM back end defines the
target hook TARGET_SMALL_REGISTER_CLASSES_FOR_MODE_P.

The intention is to keep the live ranges of hard registers short, so I think it
is inappropriate to do the propagation before IRA.

I can only think of fixing this in the following ways:
1. Run an additional cprop_hardreg pass before register renaming. Of course
this does not seem very elegant.
2. The postreload pass supports simple CSE by using cselib, so we could do the
transformation in postreload.

Currently CSELIB can't detect such cases. The root cause is:
1. Argument registers usually have no initialization; the return register is
usually initialized by a call_expr.
2. CSELIB uses the first element of the elt_list to define the mode in which
the register was set; if the mode is unknown or the value is no longer valid in
that mode, ELT will be NULL for the first element.
3. CSELIB creates a first elt_list element with a NULL ELT for argument
registers in function "cselib_lookup_1", because such registers have no
initialization.
4. CSELIB ignores return registers initialized by a call_expr, as in function
"cselib_hash_rtx". It then creates a first elt_list element with a NULL ELT
for return registers.
5. In function "cselib_reg_set_mode", CSELIB checks whether the first element
of the elt_list is NULL, with the result that argument/return registers won't
be CSEd.

But I am not sure whether CSELIB can be improved to address this issue.
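
The check in step 5 can be pictured with a toy model in C. The types and
functions below (toy_value, toy_elt_list, toy_reg_set_mode, toy_can_reuse) are
hypothetical stand-ins written for this note, not the real cselib data
structures. They only assume what the list above states: the first list
element carries the mode in which the register was set, and a NULL ELT there
means the mode is unknown.

```c
#include <assert.h>
#include <stddef.h>

enum toy_mode { MODE_UNKNOWN = 0, MODE_SI, MODE_DF };

struct toy_value { enum toy_mode mode; };

struct toy_elt_list
{
  struct toy_elt_list *next;
  struct toy_value *elt;        /* NULL: register was never set in a known mode */
};

/* Mode in which the register was last set, or MODE_UNKNOWN.  */
static enum toy_mode
toy_reg_set_mode (const struct toy_elt_list *values)
{
  if (values == NULL || values->elt == NULL)
    return MODE_UNKNOWN;
  return values->elt->mode;
}

/* Assumed consumer: the register's value may only be reused if it is
   known to have been set in the wanted mode.  */
static int
toy_can_reuse (const struct toy_elt_list *values, enum toy_mode wanted)
{
  assert (wanted != MODE_UNKNOWN);
  return toy_reg_set_mode (values) == wanted;
}
```

In this model an argument or return register whose list starts with a NULL ELT
always yields MODE_UNKNOWN, so toy_can_reuse rejects it no matter which mode
is requested.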


