* [Bug target/54429] New: [SH] SImode values get ferried through FPUL and FP regs for -O0
@ 2012-08-30 22:26 olegendo at gcc dot gnu.org
  2012-08-31 10:59 ` [Bug target/54429] " kkojima at gcc dot gnu.org
                   ` (7 more replies)
  0 siblings, 8 replies; 9+ messages in thread
From: olegendo at gcc dot gnu.org @ 2012-08-30 22:26 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54429

             Bug #: 54429
           Summary: [SH] SImode values get ferried through FPUL and FP
                    regs for -O0
    Classification: Unclassified
           Product: gcc
           Version: 4.8.0
            Status: UNCONFIRMED
          Severity: enhancement
          Priority: P3
         Component: target
        AssignedTo: olegendo@gcc.gnu.org
        ReportedBy: olegendo@gcc.gnu.org
                CC: kkojima@gcc.gnu.org
            Target: sh*-*-*


I've noticed that, for some reason, when compiling code with -O0, SImode values
get ferried through the FPUL reg and FP regs, like ..

        lds     r1,fpul
        fsts    fpul,fr1
        mov     r7,r4
        flds    fr1,fpul
        sts     fpul,r5

.. which is really just:
        mov     r1,r5
        mov     r7,r4

In sh_hard_regno_mode_ok, I've tried doing:

Index: gcc/config/sh/sh.c
===================================================================
--- gcc/config/sh/sh.c    (revision 190780)
+++ gcc/config/sh/sh.c    (working copy)
@@ -11706,6 +11706,9 @@
   if (FP_REGISTER_P (regno) && mode == SFmode)
     return true;

+  if (FP_REGISTER_P (regno) && GET_MODE_CLASS (mode) != MODE_FLOAT)
+    return false;
+
   if (mode == V2SFmode)
     {
       if (((FP_REGISTER_P (regno) && (regno - FIRST_FP_REG) % 2 == 0)

.. and it makes the confusing FP reg usage go away.  I don't understand why it
would make sense to allow any integer modes in FP regs on SH.  No useful
integer operations can be done on integer values in FP regs anyway.  The only
purpose that I can see is to allow spilling of GP regs to FP regs, which
doesn't seem beneficial.
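
The effect of the added check can be sketched with a stand-alone C model.
This is not the real sh_hard_regno_mode_ok: the enums, the helper names and
the 'patched' parameter are invented for illustration, the FP reg numbering
(64..79) is taken from the RTL dumps later in this thread (fr4 is reg 68), and
everything after the two early checks is replaced by a placeholder that simply
accepts the mode.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-ins for a small subset of GCC's mode classes and machine modes.  */
enum mode_class { CLASS_INT, CLASS_FLOAT, CLASS_VECTOR_FLOAT };
enum mode { M_SI, M_DI, M_SF, M_DF, M_V2SF };

/* Register numbering assumed for this model only.  */
enum { FIRST_FP = 64, LAST_FP = 79 };

enum mode_class
class_of (enum mode m)
{
  switch (m)
    {
    case M_SF:
    case M_DF:
      return CLASS_FLOAT;
    case M_V2SF:
      return CLASS_VECTOR_FLOAT;
    default:
      return CLASS_INT;
    }
}

bool
is_fp_reg (int regno)
{
  return regno >= FIRST_FP && regno <= LAST_FP;
}

/* Model of the two early checks; 'patched' enables the added one.  */
bool
model_regno_mode_ok (int regno, enum mode m, bool patched)
{
  if (is_fp_reg (regno) && m == M_SF)
    return true;

  /* The check added by the patch above.  */
  if (patched && is_fp_reg (regno) && class_of (m) != CLASS_FLOAT)
    return false;

  /* Placeholder for the remaining checks of the real function.  */
  return true;
}

/* Small self-check of the intended behavior.  */
void
check_model (void)
{
  assert (model_regno_mode_ok (65, M_SI, false));  /* allowed before */
  assert (!model_regno_mode_ok (65, M_SI, true));  /* refused after */
  assert (model_regno_mode_ok (1, M_SI, true));    /* GP regs unaffected */
}
```

Note that in this model the added check also refuses V2SFmode in FP regs,
because its mode class is a vector float class rather than MODE_FLOAT.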

I've checked CSiBE results with the fix above.  It causes a few code size
increases and decreases in the teem lib.  The increases occur because GP regs
are no longer spilled to FP regs and have to go to the stack instead, which
causes some unlucky displacement address changes.  The decreases have the same
cause, but in places where the displacement addresses turn out lucky.  Also,
weird cases disappear where FP regs are pushed/popped in a function but are
otherwise unused.

Kaz, do you happen to know why the sh_hard_regno_mode_ok function is the way it
is?  I think the FP reg check after the V16SFmode check ...

if (FP_REGISTER_P (regno))
    {
      if (mode == SFmode
      || mode == SImode
     ....

would also become obsolete if the change above was applied?



* [Bug target/54429] [SH] SImode values get ferried through FPUL and FP regs for -O0
  2012-08-30 22:26 [Bug target/54429] New: [SH] SImode values get ferried through FPUL and FP regs for -O0 olegendo at gcc dot gnu.org
@ 2012-08-31 10:59 ` kkojima at gcc dot gnu.org
  2012-08-31 11:54 ` olegendo at gcc dot gnu.org
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: kkojima at gcc dot gnu.org @ 2012-08-31 10:59 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54429

--- Comment #1 from Kazumoto Kojima <kkojima at gcc dot gnu.org> 2012-08-31 10:59:44 UTC ---
(In reply to comment #0)

I don't know the history behind it.  I guess that the original
intention was to use FP registers as fast memory for
integers, though I may be wrong about it.  BTW, the last time I
tried to disable integer modes for FP registers, some reload
failures popped up in the gcc testsuite.  Are there no new failures
in your tests with the above patch?



* [Bug target/54429] [SH] SImode values get ferried through FPUL and FP regs for -O0
  2012-08-30 22:26 [Bug target/54429] New: [SH] SImode values get ferried through FPUL and FP regs for -O0 olegendo at gcc dot gnu.org
  2012-08-31 10:59 ` [Bug target/54429] " kkojima at gcc dot gnu.org
@ 2012-08-31 11:54 ` olegendo at gcc dot gnu.org
  2012-11-13  8:45 ` olegendo at gcc dot gnu.org
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: olegendo at gcc dot gnu.org @ 2012-08-31 11:54 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54429

--- Comment #2 from Oleg Endo <olegendo at gcc dot gnu.org> 2012-08-31 11:54:27 UTC ---
(In reply to comment #1)
> I don't know the history behind it.  I guess that the original
> intention was to use FP registers as fast memory for
> integers, though I may be wrong about it.  BTW, the last time I
> tried to disable integer modes for FP registers, some reload
> failures popped up in the gcc testsuite.  Are there no new failures
> in your tests with the above patch?

I haven't done proper testing.  I will check out what's happening there.



* [Bug target/54429] [SH] SImode values get ferried through FPUL and FP regs for -O0
  2012-08-30 22:26 [Bug target/54429] New: [SH] SImode values get ferried through FPUL and FP regs for -O0 olegendo at gcc dot gnu.org
  2012-08-31 10:59 ` [Bug target/54429] " kkojima at gcc dot gnu.org
  2012-08-31 11:54 ` olegendo at gcc dot gnu.org
@ 2012-11-13  8:45 ` olegendo at gcc dot gnu.org
  2012-11-13 22:25 ` olegendo at gcc dot gnu.org
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: olegendo at gcc dot gnu.org @ 2012-11-13  8:45 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54429

--- Comment #3 from Oleg Endo <olegendo at gcc dot gnu.org> 2012-11-13 08:44:43 UTC ---
I've tested this:

Index: gcc/config/sh/sh.c
===================================================================
--- gcc/config/sh/sh.c    (revision 193423)
+++ gcc/config/sh/sh.c    (working copy)
@@ -12113,6 +12113,11 @@
   if (FP_REGISTER_P (regno) && mode == SFmode)
     return true;

+  if (FP_REGISTER_P (regno)
+      && !(GET_MODE_CLASS (mode) == MODE_FLOAT
+       || GET_MODE_CLASS (mode) == MODE_VECTOR_FLOAT))
+    return false;
+
   if (mode == V2SFmode)
     {
       if (((FP_REGISTER_P (regno) && (regno - FIRST_FP_REG) % 2 == 0)


on rev 193423.  There are a few failures on targets with HW FPU:

FAIL: gcc.c-torture/execute/20080502-1.c compilation
FAIL: gcc.c-torture/execute/ieee/copysign1.c compilation
FAIL: gcc.dg/builtins-32.c (internal compiler error)
FAIL: gcc.dg/builtins-50.c (internal compiler error)
FAIL: gcc.dg/pr48335-7.c (internal compiler error)

I'll check out the details.



* [Bug target/54429] [SH] SImode values get ferried through FPUL and FP regs for -O0
  2012-08-30 22:26 [Bug target/54429] New: [SH] SImode values get ferried through FPUL and FP regs for -O0 olegendo at gcc dot gnu.org
                   ` (2 preceding siblings ...)
  2012-11-13  8:45 ` olegendo at gcc dot gnu.org
@ 2012-11-13 22:25 ` olegendo at gcc dot gnu.org
  2012-11-14  9:45 ` olegendo at gcc dot gnu.org
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: olegendo at gcc dot gnu.org @ 2012-11-13 22:25 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54429

--- Comment #4 from Oleg Endo <olegendo at gcc dot gnu.org> 2012-11-13 22:25:30 UTC ---
(In reply to comment #3)
> I've tested this:
> 
> Index: gcc/config/sh/sh.c
> ===================================================================
> --- gcc/config/sh/sh.c    (revision 193423)
> +++ gcc/config/sh/sh.c    (working copy)
> @@ -12113,6 +12113,11 @@
>    if (FP_REGISTER_P (regno) && mode == SFmode)
>      return true;
> 
> +  if (FP_REGISTER_P (regno)
> +      && !(GET_MODE_CLASS (mode) == MODE_FLOAT
> +       || GET_MODE_CLASS (mode) == MODE_VECTOR_FLOAT))
> +    return false;
> +
>    if (mode == V2SFmode)
>      {
>        if (((FP_REGISTER_P (regno) && (regno - FIRST_FP_REG) % 2 == 0)
> 
> 
> on rev 193423.  There are a few failures on targets with HW FPU:
> 

It seems these problems happen on big endian targets only.

> FAIL: gcc.c-torture/execute/20080502-1.c compilation

Reload failure.  Problematic insn:

(insn 12 44 13 2 (set (reg:SI 147 t)
        (eq:SI (and:SI (reg:SI 1 r1 [166])
                (subreg:SI (reg:DF 68 fr4 [ x ]) 0))
            (const_int 0 [0]))) sh_tmp.cpp:110 1 {tstsi_t}
     (expr_list:REG_DEAD (reg:SI 1 r1 [166])
        (expr_list:REG_DEAD (reg:DF 68 fr4 [ x ])
            (nil))))

> FAIL: gcc.c-torture/execute/ieee/copysign1.c compilation

Reload failure.  Problematic insn:

(insn 10 41 11 2 (set (reg:SI 147 t)
        (eq:SI (and:SI (reg:SI 1 r1 [165])
                (subreg:SI (reg:DF 70 fr6 [ y ]) 0))
            (const_int 0 [0]))) sh_tmp.cpp:67 1 {tstsi_t}
     (expr_list:REG_DEAD (reg:SI 1 r1 [165])
        (expr_list:REG_DEAD (reg:DF 70 fr6 [ y ])
            (nil))))

> FAIL: gcc.dg/builtins-32.c (internal compiler error)

Reload failure.  Problematic insn:

(insn 8 7 21 2 (set (reg:SI 0 r0 [164])
        (and:SI (reg:SI 0 r0 [164])
            (subreg:SI (reg:DF 68 fr4 [ x ]) 0))) sh_tmp.cpp:30 111 {*andsi_compact}
     (expr_list:REG_DEAD (reg:DF 68 fr4 [ x ])
        (nil)))

> FAIL: gcc.dg/builtins-50.c (internal compiler error)

Reload failure.  Problematic insn:

(insn 10 43 11 2 (set (reg:SI 147 t)
        (eq:SI (and:SI (reg:SI 1 r1 [165])
                (subreg:SI (reg:DF 70 fr6 [ y ]) 0))
            (const_int 0 [0]))) sh_tmp.cpp:24 1 {tstsi_t}
     (expr_list:REG_DEAD (reg:SI 1 r1 [165])
        (expr_list:REG_DEAD (reg:DF 70 fr6 [ y ])
            (nil))))


> FAIL: gcc.dg/pr48335-7.c (internal compiler error)

Reload failure.  Problematic insn:

(insn 9 3 10 2 (set (reg:SI 0 r0 [167])
        (ashift:SI (subreg:SI (reg:DF 68 fr4 [ x ]) 0)
            (const_int 8 [0x8]))) sh_tmp.cpp:28 149 {ashlsi3_k}
     (expr_list:REG_DEAD (reg:DF 68 fr4 [ x ])
        (nil)))


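The problem is that 'arith_reg_operand' also accepts subregs of regs in FP
modes.  As a result, combine, for example, folds insn sequences such as the
following into a single insn that uses the subreg directly as an operand (like
the ashlsi3_k insn above):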

(insn 8 7 9 2 (set (reg:SI 166)
        (subreg:SI (reg:DI 165) 0)) sh_tmp.cpp:33 -1
     (nil))

(insn 9 8 10 2 (set (reg:SI 167)
        (ashift:SI (reg:SI 166)
            (const_int 8 [0x8]))) sh_tmp.cpp:33 -1
     (nil))


Adding this:

Index: gcc/config/sh/predicates.md
===================================================================
--- gcc/config/sh/predicates.md    (revision 193423)
+++ gcc/config/sh/predicates.md    (working copy)
@@ -156,7 +156,12 @@
       if (REG_P (op))
     regno = REGNO (op);
       else if (GET_CODE (op) == SUBREG && REG_P (SUBREG_REG (op)))
-    regno = REGNO (SUBREG_REG (op));
+    {
+      regno = REGNO (SUBREG_REG (op));
+      if (!(GET_MODE_CLASS (GET_MODE (SUBREG_REG (op))) == MODE_INT
+            || GET_MODE_CLASS (GET_MODE (SUBREG_REG (op))) == MODE_VECTOR_INT))
+        return false;
+    }
       else
     return 1;


makes the unwanted subreg propagation go away, but runs into another reload
problem:

sh_tmp.cpp:92:1: error: unable to find a register to spill in class 'TARGET_REGS'
 }
 ^
sh_tmp.cpp:92:1: error: this is the insn:
(insn 7 4 8 2 (set (reg:SI 1 r1 [165])
        (subreg:SI (reg:DF 70 fr6 [ y ]) 0)) sh_tmp.cpp:91 244 {movsi_ie}
     (expr_list:REG_DEAD (reg:DF 70 fr6 [ y ])
        (nil)))

On little endian this problem does not happen and the same insn right before
the reload pass looks like:

(insn 7 4 8 2 (set (reg:SI 165)
        (subreg:SI (reg/v:DF 163 [ y ]) 4)) sh_tmp.cpp:91 244 {movsi_ie}
     (expr_list:REG_DEAD (reg/v:DF 163 [ y ])
        (nil)))

Notice that on big endian the insn contains hard regs.
I've tried adding debug output to sh_secondary_reload and there are some weird
things happening, such as:

in = 1  rclass = 8  mode = SI  x = (reg:DF 70 fr6 [ y ])

 --> rclass = 5

This is caused by the way sh_cannot_change_mode_class handles this case on big
endian.  Adding this to 'sh_cannot_change_mode_class':

  if (from == DFmode && to == SImode)
    return true;

fixes at least one of the test cases, but I don't know what other consequences
it has.
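
For reference, the two pre-reload insns above differ only in the subreg byte
offset: 0 on big endian and 4 on little endian.  In both cases this is the
SImode word that holds the sign bit of the DFmode value.  The following
stand-alone C sketch illustrates that, assuming IEEE 754 binary64 and treating
the subreg offset as a plain memory-order byte offset.  The function names are
hypothetical, and the sketch says nothing about how SH lays out DFmode values
in FP reg pairs.

```c
#include <assert.h>
#include <stdint.h>

/* Store a 64-bit value into BUF in the requested byte order.  */
void
store_u64 (unsigned char buf[8], uint64_t v, int big_endian)
{
  for (int i = 0; i < 8; i++)
    {
      int shift = big_endian ? 56 - 8 * i : 8 * i;
      buf[i] = (unsigned char) (v >> shift);
    }
}

/* Load the 32-bit word at byte offset OFF, using the same byte order.  */
uint32_t
load_u32 (const unsigned char *buf, int off, int big_endian)
{
  uint32_t v = 0;
  for (int i = 0; i < 4; i++)
    {
      int shift = big_endian ? 24 - 8 * i : 8 * i;
      v |= (uint32_t) buf[off + i] << shift;
    }
  return v;
}

/* Byte offset of the word that contains the sign bit of a double.  */
int
sign_word_offset (int big_endian)
{
  return big_endian ? 0 : 4;
}

/* Self-check using the bit pattern of -1.0 (0xBFF0000000000000).  */
void
check_offsets (void)
{
  unsigned char buf[8];
  const uint64_t minus_one = UINT64_C (0xBFF0000000000000);

  for (int be = 0; be <= 1; be++)
    {
      store_u64 (buf, minus_one, be);
      uint32_t w = load_u32 (buf, sign_word_offset (be), be);
      assert (w & UINT32_C (0x80000000));
    }
}
```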



* [Bug target/54429] [SH] SImode values get ferried through FPUL and FP regs for -O0
  2012-08-30 22:26 [Bug target/54429] New: [SH] SImode values get ferried through FPUL and FP regs for -O0 olegendo at gcc dot gnu.org
                   ` (3 preceding siblings ...)
  2012-11-13 22:25 ` olegendo at gcc dot gnu.org
@ 2012-11-14  9:45 ` olegendo at gcc dot gnu.org
  2014-10-12 17:28 ` olegendo at gcc dot gnu.org
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: olegendo at gcc dot gnu.org @ 2012-11-14  9:45 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54429

--- Comment #5 from Oleg Endo <olegendo at gcc dot gnu.org> 2012-11-14 09:45:22 UTC ---
(In reply to comment #4)
> 
> makes the unwanted subreg propagation go away, but runs into another reload
> problem:
> 
> sh_tmp.cpp:92:1: error: unable to find a register to spill in class 'TARGET_REGS'
>  }
>  ^
> sh_tmp.cpp:92:1: error: this is the insn:
> (insn 7 4 8 2 (set (reg:SI 1 r1 [165])
>         (subreg:SI (reg:DF 70 fr6 [ y ]) 0)) sh_tmp.cpp:91 244 {movsi_ie}
>      (expr_list:REG_DEAD (reg:DF 70 fr6 [ y ])
>         (nil)))
> 

The test case I was using for tracking this:

extern double copysign(double,double);
double test1(double x, double y)
{
  return copysign(-x,y);
}

Interestingly, without any patches applied, compiling the function above with
-O2 -m4 -ml results in:

_test1:
        fmov.s  fr6,@-r15 ! 49    movsf_ie/7    [length = 2]
        fmov.s  fr7,@-r15 ! 50    movsf_ie/7    [length = 2]
        mov.l   .L8,r1    ! 43    movsi_ie/1    [length = 2]
        mov.l   @r15+,r2  ! 51    movdf_k/3    [length = 4]
        mov.l   @r15+,r3
        tst     r3,r1     ! 10    tstsi_t/2    [length = 2]
        fmov    fr4,fr0   ! 52    movsf_ie/1    [length = 2]
        fmov    fr5,fr1   ! 53    movsf_ie/1    [length = 2]
        bt/s    .L6       ! 11    *cbranch_t    [length = 2]
        fabs    dr0       ! 9    absdf2_i    [length = 2]
        fneg    dr0       ! 12    negdf2_i    [length = 2]
.L6:
        rts
        nop               ! 56    *return_i    [length = 4]
.L9:
        .align 2
.L8:
        .long   -2147483648


and with -O2 -m4 -mb the result is:

_test1:
        flds    fr6,fpul  ! 46    movsi_ie/22    [length = 2]
        sts     fpul,r1   ! 47    movsi_ie/20    [length = 2]
        fmov    fr4,fr0   ! 48    movsf_ie/1    [length = 2]
        fmov    fr5,fr1   ! 49    movsf_ie/1    [length = 2]
        cmp/pz  r1        ! 10    cmpgesi_t/1    [length = 2]
        bt/s    .L6       ! 11    *cbranch_t    [length = 2]
        fabs    dr0       ! 9    absdf2_i    [length = 2]
        fneg    dr0       ! 12    negdf2_i    [length = 2]
.L6:
        rts
        nop               ! 52    *return_i    [length = 4]

This is probably due to 'sh_cannot_change_mode_class'.  On big endian the
DFmode reg can be accessed as SImode, but not on little endian because of the
subreg ordering.  Maybe it would be better not to do all the GP <-> FPUL <-> FP
reg transfers during reload, but to emit the necessary insn sequences before
reload.
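
Both listings implement the same idea: take the absolute value of x and negate
it again if the sign bit in the high SImode word of y is set.  A rough C
rendering of that logic is shown below.  It is a sketch only, assuming a
64-bit IEEE 754 double; 'test1_model' and the helper names are hypothetical
and not part of the test case.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret a double as its raw 64-bit pattern.  */
uint64_t
bits_of (double d)
{
  uint64_t b;
  memcpy (&b, &d, sizeof b);
  return b;
}

/* Reinterpret a raw 64-bit pattern as a double.  */
double
double_of (uint64_t b)
{
  double d;
  memcpy (&d, &b, sizeof d);
  return d;
}

/* Model of test1, i.e. copysign (-x, y).  */
double
test1_model (double x, double y)
{
  /* High SImode word of y.  The constant at .L8 (-2147483648) is the
     sign bit mask 0x80000000.  */
  uint32_t y_high = (uint32_t) (bits_of (y) >> 32);

  /* fabs dr0: clear the sign bit.  The negation of x does not matter,
     because the sign is overwritten anyway.  */
  double r = double_of (bits_of (x) & ~(UINT64_C (1) << 63));

  /* tst (little endian) or cmp/pz (big endian) followed by bt/s:
     the fneg is skipped when the sign bit of y is clear.  */
  if (y_high & UINT32_C (0x80000000))
    r = -r;   /* fneg dr0 */

  return r;
}

/* Small self-check.  */
void
check_test1_model (void)
{
  assert (test1_model (3.0, -1.0) == -3.0);
  assert (test1_model (-3.0, 2.0) == 3.0);
}
```

The integer word is only needed for the sign test, which is why the value of y
has to travel from an FP reg to a GP reg in the first place.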



* [Bug target/54429] [SH] SImode values get ferried through FPUL and FP regs for -O0
  2012-08-30 22:26 [Bug target/54429] New: [SH] SImode values get ferried through FPUL and FP regs for -O0 olegendo at gcc dot gnu.org
                   ` (4 preceding siblings ...)
  2012-11-14  9:45 ` olegendo at gcc dot gnu.org
@ 2014-10-12 17:28 ` olegendo at gcc dot gnu.org
  2014-12-27 11:41 ` olegendo at gcc dot gnu.org
  2014-12-27 13:56 ` olegendo at gcc dot gnu.org
  7 siblings, 0 replies; 9+ messages in thread
From: olegendo at gcc dot gnu.org @ 2014-10-12 17:28 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54429

--- Comment #6 from Oleg Endo <olegendo at gcc dot gnu.org> ---
A test case for this problem is gcc/testsuite/g++.dg/tls/thread_local-order1.C,
which is compiled without optimizations and contains the following sequence:

        stc     gbr,r1
        mov.l   .L20,r2
        add     r2,r1
        lds     r1,fpul
        fsts    fpul,fr1
        flds    fr1,fpul
        sts     fpul,r0
        mov     r14,r15
        lds.l   @r15+,pr
        mov.l   @r15+,r14
        rts
        nop

What the code is actually doing:
        stc     gbr,r1
        mov.l   .L20,r2
        add     r2,r1
        mov     r1,r0
        mov     r14,r15
        lds.l   @r15+,pr
        mov.l   @r15+,r14
        rts
        nop



* [Bug target/54429] [SH] SImode values get ferried through FPUL and FP regs for -O0
  2012-08-30 22:26 [Bug target/54429] New: [SH] SImode values get ferried through FPUL and FP regs for -O0 olegendo at gcc dot gnu.org
                   ` (5 preceding siblings ...)
  2014-10-12 17:28 ` olegendo at gcc dot gnu.org
@ 2014-12-27 11:41 ` olegendo at gcc dot gnu.org
  2014-12-27 13:56 ` olegendo at gcc dot gnu.org
  7 siblings, 0 replies; 9+ messages in thread
From: olegendo at gcc dot gnu.org @ 2014-12-27 11:41 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54429

--- Comment #7 from Oleg Endo <olegendo at gcc dot gnu.org> ---
Another minimal test case:

int var;

int test (void)
{
  return var;
}


        mov.l   r14,@-r15
        mov     r15,r14
        mov.l   .L3,r1
        mov.l   @r1,r1
        lds     r1,fpul
        fsts    fpul,fr1
        flds    fr1,fpul
        sts     fpul,r0
        mov     r14,r15
        mov.l   @r15+,r14
        rts
        nop
.L4:
        .align 2
.L3:
        .long   _var



* [Bug target/54429] [SH] SImode values get ferried through FPUL and FP regs for -O0
  2012-08-30 22:26 [Bug target/54429] New: [SH] SImode values get ferried through FPUL and FP regs for -O0 olegendo at gcc dot gnu.org
                   ` (6 preceding siblings ...)
  2014-12-27 11:41 ` olegendo at gcc dot gnu.org
@ 2014-12-27 13:56 ` olegendo at gcc dot gnu.org
  7 siblings, 0 replies; 9+ messages in thread
From: olegendo at gcc dot gnu.org @ 2014-12-27 13:56 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54429

--- Comment #8 from Oleg Endo <olegendo at gcc dot gnu.org> ---
BTW, the problem is also there when using LRA.

