public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed
* PR target/16482: first scheduling pass on SH4
@ 2004-10-06 12:23 Kaz Kojima
  0 siblings, 0 replies; 9+ messages in thread
From: Kaz Kojima @ 2004-10-06 12:23 UTC (permalink / raw)
  To: gcc

Hi,

PR target/16482 is a regression for sh-elf from 3.4.  The following
part of gcc.c-torture/unsorted/SFset.c:

int glob1;
 
adrreg0limm1_set (float *p0)
{
  p0[10000000] = (float) glob1;
}

causes a spill failure for sh-elf on mainline with -m4 -O2.
It seems that the first insn scheduling which has been enabled for
SH-4 on mainline badly interacts with reloading.  The .sched file
for the problematic function looks like:

;;   ======================================================
;;   -- basic block 1 from 15 to 35 -- before reload
;;   ======================================================

;;	  0--> 36   r1=`__fpscr_values'                :(issue+load_store),nothing,memory
;;	  0--> 25   clobber r0                         :nothing
;;	  0--> 26   clobber r158                       :nothing
;;	  0--> 24   r0=r158                            :issue
;;	  1--> 17   r163=`glob1'                       :(issue+load_store),nothing,memory
;;	  1--> 27   use r0                             :nothing
;;	  2--> 37   =[r1]                              :d_lock,nothing,(F1+memory),F1*2
;;	  3--> 18   r165=[r163]                        :(issue+load_store),nothing,memory
;;	  4--> 15   r161=0x2625a00                     :(issue+load_store),nothing,memory
;;	  5--> 33   r1=`__fpscr_values'                :(issue+load_store),nothing,memory
;;	  6--> 19   {r164=flt(r165);use ;}             :issue,F01,F2
;;	  7--> 34   r166=r1+0x4                        :issue,int
;;	  8--> 20   {[r4+r161]=r164;use ;clobber scratc:(issue+load_store),nothing,memory
;;	  9--> 35   =[r166]                            :d_lock,nothing,(F1+memory),F1*2
;;	Ready list (final):  
;;   total time = 9
;;   new head = 16
;;   new tail = 35

Thus the first insn scheduling permutes [r4+r161]=r164 into a lifetime
of R0 and the spill process needs R0 for the expression for [r4+r161]
because SH-4 uses R0 as the base register for the addressing mode with
an index register.

Does the first insn scheduling for SH-4 take account of such insn which
uses R0 implicitly?  Or is it a problem of the reload pass?

I've tried to inhibit the permutation like above for the basic blocks
which have insns with [rN+rM] and R0 liftimes.  It works for me, but
what is the right thing to do?

Regards,
	kaz

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: PR target/16482: first scheduling pass on SH4
  2004-11-05 16:04         ` Joern RENNECKE
@ 2004-11-05 21:44           ` Kaz Kojima
  0 siblings, 0 replies; 9+ messages in thread
From: Kaz Kojima @ 2004-11-05 21:44 UTC (permalink / raw)
  To: joern.rennecke; +Cc: aoliva, tm_gccmail, gcc, gcc-patches, naveens, amylaar

Joern RENNECKE <joern.rennecke@st.com> wrote:
> The STMicroelectronics Copyright assignment papers have now been signed, 
> and are expected
> to be with the FSF by middle of next week.
> 
> I have written a patch for PR16482 today, as outlined in my analysis; if 
> you like, I can send to
> it you now.

Great!  Could you please send it to me?  I'll report the result.

Regards,
	kaz

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: PR target/16482: first scheduling pass on SH4
  2004-11-04  1:44       ` Kaz Kojima
@ 2004-11-05 16:04         ` Joern RENNECKE
  2004-11-05 21:44           ` Kaz Kojima
  0 siblings, 1 reply; 9+ messages in thread
From: Joern RENNECKE @ 2004-11-05 16:04 UTC (permalink / raw)
  To: Kaz Kojima; +Cc: aoliva, tm_gccmail, gcc, gcc-patches, naveens, amylaar

Kaz Kojima wrote:

>    
>
>I hope that those failures also come from the lcm.c problem suggested
>by Joern.
>  
>
 
The STMicroelectronics Copyright assignment papers have now been signed, 
and are expected
to be with the FSF by middle of next week.

I have written a patch for PR16482 today, as outlined in my analysis; if 
you like, I can send to
it you now.


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: PR target/16482: first scheduling pass on SH4
  2004-10-29 13:30     ` Kaz Kojima
  2004-10-29 14:36       ` tm_gccmail
@ 2004-11-04  1:44       ` Kaz Kojima
  2004-11-05 16:04         ` Joern RENNECKE
  1 sibling, 1 reply; 9+ messages in thread
From: Kaz Kojima @ 2004-11-04  1:44 UTC (permalink / raw)
  To: aoliva; +Cc: tm_gccmail, gcc, gcc-patches, joern.rennecke, naveens

> Alexandre Oliva <aoliva@redhat.com> wrote:
>>> 	* config/sh/sh.c (implicit_r0_use_block): New variable.
>>> 	(may_use_r0_in_reload, find_implicit_r0_use): New.
>>> 	(sh_md_init_global): Initialize and set implicit_r0_use_block.
>>> 	(sh_md_finish_global): Cleanup implicit_r0_use_block if needed.
>>> 	(implicit_r0_pressure): New.
>>> 	(sh_reorder): Use implicit_r0_pressure.
>>> 	(sh_reorder2): Likewise.
>>
>> I'm a bit concerned with this approach.  Consider, for example, a (mem
>> (reg)), in which reload finds this reg to be equivalent to a (plus
>> (reg) (reg)), or a (plus (reg) (const_int BIG)).  Both might end up
>> needing r0, and I don't quite see how you could prevent reload from
>> trying such a substitution.
> 
> Agreed.

BTW, I've seen many spill failures in compiling glibc with 4.0.0
for sh4 without -fno-schedule-insns.  Some of them can't be fixed
with my experimental patch.  For example,

(a)
strtod_l.c: In function 'round_and_return':
strtod_l.c:275: error: unable to find a register to spill in class 'R0_REGS'
strtod_l.c:275: error: this is the insn:
(insn:HI 39 32 34 2 strtod_l.c:198 (parallel [
            (set (reg:SF 64 fr0 [orig:160 D.8163 ] [160])
                (const_double:SF 0.0 [0x0.0p+0]))
            (use (reg/v:PSI 151 ))
            (clobber (scratch:SI))
        ]) 157 {movsf_ie} (nil)
    (expr_list:REG_UNUSED (scratch:SI)
        (nil)))

(b)
initgroups.c: In function 'initgroups':
initgroups.c:220: error: unable to find a register to spill in class 'R0_REGS'
initgroups.c:220: error: this is the insn:
(insn:HI 86 85 92 7 initgroups.c:214 (set (reg:SI 147 t)
        (eq:SI (reg/v:SI 10 r10 [orig:161 result ] [161])
            (const_int -1 [0xffffffffffffffff]))) 1 {cmpeqsi_t} (insn_list:REG_DEP_ANTI 81 (insn_list:REG_DEP_ANTI 82 (insn_list:REG_DEP_ANTI 83 (insn_list:REG_DEP_TRUE 84 (nil)))))
    (nil))

So your are quite right.  My approach wouldn't work at all for (b).
I hope that those failures also come from the lcm.c problem suggested
by Joern.

Regards,
	kaz

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: PR target/16482: first scheduling pass on SH4
  2004-10-29 13:30     ` Kaz Kojima
@ 2004-10-29 14:36       ` tm_gccmail
  2004-11-04  1:44       ` Kaz Kojima
  1 sibling, 0 replies; 9+ messages in thread
From: tm_gccmail @ 2004-10-29 14:36 UTC (permalink / raw)
  To: Kaz Kojima; +Cc: aoliva, gcc, gcc-patches, joern.rennecke, naveens

On Fri, 29 Oct 2004, Kaz Kojima wrote:

> > I'm a bit concerned with this approach.  Consider, for example, a (mem
> > (reg)), in which reload finds this reg to be equivalent to a (plus
> > (reg) (reg)), or a (plus (reg) (const_int BIG)).  Both might end up
> > needing r0, and I don't quite see how you could prevent reload from
> > trying such a substitution.
> 
> Agreed.
> 
> > That said, your patch definitely fixes a real problem.  I'm just
> > concerned that it doesn't fix it completely.  The approach feels a bit
> > like overkill, but I could live with it.  Jörn, any concerns?  Anyone,
> > any further thoughts?
> 
> Jörn has pointed out in #4 of PR description thread that the real
> problem is in lcm.c:optimize_mode_switching.
> 
> Regards,
> 	kaz

Eventually, we should keep track of R0 register pressure separately, and
force instructions which free R0 to be emitted first.

But I guess this needs to stay for now.

Toshi



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: PR target/16482: first scheduling pass on SH4
  2004-10-29 12:21   ` Alexandre Oliva
@ 2004-10-29 13:30     ` Kaz Kojima
  2004-10-29 14:36       ` tm_gccmail
  2004-11-04  1:44       ` Kaz Kojima
  0 siblings, 2 replies; 9+ messages in thread
From: Kaz Kojima @ 2004-10-29 13:30 UTC (permalink / raw)
  To: aoliva; +Cc: tm_gccmail, gcc, gcc-patches, joern.rennecke, naveens

Alexandre Oliva <aoliva@redhat.com> wrote:
>> 	* config/sh/sh.c (implicit_r0_use_block): New variable.
>> 	(may_use_r0_in_reload, find_implicit_r0_use): New.
>> 	(sh_md_init_global): Initialize and set implicit_r0_use_block.
>> 	(sh_md_finish_global): Cleanup implicit_r0_use_block if needed.
>> 	(implicit_r0_pressure): New.
>> 	(sh_reorder): Use implicit_r0_pressure.
>> 	(sh_reorder2): Likewise.
>
> I'm a bit concerned with this approach.  Consider, for example, a (mem
> (reg)), in which reload finds this reg to be equivalent to a (plus
> (reg) (reg)), or a (plus (reg) (const_int BIG)).  Both might end up
> needing r0, and I don't quite see how you could prevent reload from
> trying such a substitution.

Agreed.

> That said, your patch definitely fixes a real problem.  I'm just
> concerned that it doesn't fix it completely.  The approach feels a bit
> like overkill, but I could live with it.  Jörn, any concerns?  Anyone,
> any further thoughts?

Jörn has pointed out in #4 of PR description thread that the real
problem is in lcm.c:optimize_mode_switching.

Regards,
	kaz

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: PR target/16482: first scheduling pass on SH4
  2004-10-15 15:52 ` Kaz Kojima
@ 2004-10-29 12:21   ` Alexandre Oliva
  2004-10-29 13:30     ` Kaz Kojima
  0 siblings, 1 reply; 9+ messages in thread
From: Alexandre Oliva @ 2004-10-29 12:21 UTC (permalink / raw)
  To: Kaz Kojima; +Cc: tm_gccmail, gcc, gcc-patches, joern.rennecke, naveens

On Oct 15, 2004, Kaz Kojima <kkojima@rr.iij4u.or.jp> wrote:

> 	* config/sh/sh.c (implicit_r0_use_block): New variable.
> 	(may_use_r0_in_reload, find_implicit_r0_use): New.
> 	(sh_md_init_global): Initialize and set implicit_r0_use_block.
> 	(sh_md_finish_global): Cleanup implicit_r0_use_block if needed.
> 	(implicit_r0_pressure): New.
> 	(sh_reorder): Use implicit_r0_pressure.
> 	(sh_reorder2): Likewise.

I'm a bit concerned with this approach.  Consider, for example, a (mem
(reg)), in which reload finds this reg to be equivalent to a (plus
(reg) (reg)), or a (plus (reg) (const_int BIG)).  Both might end up
needing r0, and I don't quite see how you could prevent reload from
trying such a substitution.

That said, your patch definitely fixes a real problem.  I'm just
concerned that it doesn't fix it completely.  The approach feels a bit
like overkill, but I could live with it.  Jörn, any concerns?  Anyone,
any further thoughts?

-- 
Alexandre Oliva             http://www.ic.unicamp.br/~oliva/
Red Hat Compiler Engineer   aoliva@{redhat.com, gcc.gnu.org}
Free Software Evangelist  oliva@{lsd.ic.unicamp.br, gnu.org}

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: PR target/16482: first scheduling pass on SH4
  2004-10-15 10:56 tm_gccmail
@ 2004-10-15 15:52 ` Kaz Kojima
  2004-10-29 12:21   ` Alexandre Oliva
  0 siblings, 1 reply; 9+ messages in thread
From: Kaz Kojima @ 2004-10-15 15:52 UTC (permalink / raw)
  To: tm_gccmail; +Cc: gcc, gcc-patches, joern.rennecke, aoliva, naveens

[I'd like to cc this message to gcc-patches and SH port maintainers
because it includes a patch anyway.]

tm_gccmail@kloo.net wrote:
>> Does the first insn scheduling for SH-4 take account of such insn which
>> uses R0 implicitly?  Or is it a problem of the reload pass?
>
> The first instruction scheduling pass cannot extend the lifetime of
> pseudos which need r0 since there is only one r0 register.
>
> So yes, this sounds like an SH first instruction scheduling pass bug.

Thanks for your explanation.  I've attached the patch which I mentioned
in http://gcc.gnu.org/ml/gcc/2004-10/msg00176.html, though it could be
overkill.  It bootstraps on sh4-unknown-linux-gnu with no new regressions.

> Incidentally, I can reproduce the same problem with a small array index
> and -fPIC. Compile this with -O2 -mf -fPIC:
>
> int glob1;
>
> adrreg01limm1_set (float *p0)
> {
>         p0[10] = (float) glob1;
> }

Wow, too simple :-)  I've confirmed that the patch gets rid of the ICE
for it.

Regards,
	kaz
--
	* config/sh/sh.c (implicit_r0_use_block): New variable.
	(may_use_r0_in_reload, find_implicit_r0_use): New.
	(sh_md_init_global): Initialize and set implicit_r0_use_block.
	(sh_md_finish_global): Cleanup implicit_r0_use_block if needed.
	(implicit_r0_pressure): New.
	(sh_reorder): Use implicit_r0_pressure.
	(sh_reorder2): Likewise.

diff -uprN ORIG/gcc/gcc/config/sh/sh.c LOCAL/gcc/gcc/config/sh/sh.c
--- ORIG/gcc/gcc/config/sh/sh.c	2004-10-08 07:46:30.000000000 +0900
+++ LOCAL/gcc/gcc/config/sh/sh.c	2004-10-15 14:08:12.000000000 +0900
@@ -113,6 +113,10 @@ static short *regmode_weight[2];
 /* Total SFmode and SImode weights of scheduled insns.  */
 static int curr_regmode_pressure[2];
 
+/* Array indexed by basic block index whose element shows that an insn in
+   this basic block may use r0 in reload.  */
+static bool *implicit_r0_use_block;
+
 /* If true, skip cycles for Q -> R movement.  */
 static int skip_cycles = 0;
 
@@ -8744,6 +8748,73 @@ find_regmode_weight (int b, enum machine
     }
 }
 
+/* Return true if the pattern may use r0 in reload.  */
+static bool
+may_use_r0_in_reload (rtx x)
+{
+  /* The expression (mem (plus (reg) (reg))) may cause a reload with r0.  */
+  if (GET_CODE (x) == SET)
+    {
+      rtx m;
+
+      if (GET_CODE (SET_SRC (x)) == MEM)
+	m = SET_SRC (x);
+      else if (GET_CODE (SET_DEST (x)) == MEM)
+	m = SET_DEST (x);
+      else 
+	m = NULL_RTX;
+
+      if (m)
+	{
+	  if (GET_CODE (XEXP (m, 0)) == PLUS)
+	    {
+	      rtx base = XEXP (XEXP (m, 0), 0);
+	      rtx ofs = XEXP (XEXP (m, 0), 1);
+
+	      if ((GET_CODE (base) == REG || GET_CODE (base) == SUBREG)
+		  && (GET_CODE (ofs) == REG || GET_CODE (ofs) == SUBREG))
+		return true;
+	    }
+	}
+    }
+
+  if (GET_CODE (x) == PARALLEL)
+    {
+      int j;
+      for (j = XVECLEN (x, 0) - 1; j >= 0; j--)
+	{
+	  rtx y = XVECEXP (x, 0, j);
+	  if (may_use_r0_in_reload (y))
+	    return true;
+	}
+    }
+
+  return false;
+}
+
+/* Set implicit_r0_use_block[B] if the basic block B includes an insn which
+   may use r0 in reload.  */
+static void
+find_implicit_r0_use (int b)
+{
+  rtx insn, next_tail, head, tail;
+
+  get_block_head_tail (b, &head, &tail);
+  next_tail = NEXT_INSN (tail);
+
+  for (insn = head; insn != next_tail; insn = NEXT_INSN (insn))
+    {
+      if (!INSN_P (insn))
+	continue;
+
+      if (may_use_r0_in_reload (PATTERN (insn)))
+	{
+	  implicit_r0_use_block[b] = true;
+	  break;
+	}
+    }
+}
+
 /* Comparison function for ready queue sorting.  */
 static int
 rank_for_reorder (const void *x, const void *y)
@@ -8801,14 +8872,18 @@ sh_md_init_global (FILE *dump ATTRIBUTE_
 		   int old_max_uid)
 {
   basic_block b;
+  int max_bb;
 
   regmode_weight[0] = (short *) xcalloc (old_max_uid, sizeof (short));
   regmode_weight[1] = (short *) xcalloc (old_max_uid, sizeof (short));
+  max_bb = VARRAY_SIZE (basic_block_info);
+  implicit_r0_use_block = (bool *) xcalloc (max_bb, sizeof (bool));
 
   FOR_EACH_BB_REVERSE (b)
   {
     find_regmode_weight (b->index, SImode);
     find_regmode_weight (b->index, SFmode);
+    find_implicit_r0_use (b->index);
   }
 
   CURR_REGMODE_PRESSURE (SImode) = 0;
@@ -8831,6 +8906,11 @@ sh_md_finish_global (FILE *dump ATTRIBUT
       free (regmode_weight[1]);
       regmode_weight[1] = NULL;
     }
+  if (implicit_r0_use_block)
+    {
+      free (implicit_r0_use_block);
+      implicit_r0_use_block = NULL;
+    }
 }
 
 /* Cache the can_issue_more so that we can return it from reorder2. Also,
@@ -8890,6 +8970,20 @@ high_pressure (enum machine_mode mode)
     return (CURR_REGMODE_PRESSURE (SImode) > SIMODE_MAX_WEIGHT);
 }
 
+/* Return true if there is an insn which may use r0 in ready queue
+   and there is an explicit r0 use.  */
+static bool
+implicit_r0_pressure (rtx *ready, int n ATTRIBUTE_UNUSED)
+{
+  return true;
+
+  if (REG_N_SETS (0) > 0
+      && implicit_r0_use_block[BLOCK_NUM (ready[0])])
+    return true;
+
+  return false;
+}
+
 /* Reorder ready queue if register pressure is high.  */
 static int
 sh_reorder (FILE *dump ATTRIBUTE_UNUSED,
@@ -8901,7 +8995,8 @@ sh_reorder (FILE *dump ATTRIBUTE_UNUSED,
   if (reload_completed)
     return sh_issue_rate ();
 
-  if (high_pressure (SFmode) || high_pressure (SImode))
+  if (high_pressure (SFmode) || high_pressure (SImode)
+      || implicit_r0_pressure (ready, *n_readyp))
     {
       ready_reorder (ready, *n_readyp);
     }
@@ -8920,7 +9015,8 @@ sh_reorder2 (FILE *dump ATTRIBUTE_UNUSED
   if (reload_completed)
     return cached_can_issue_more;
 
-  if (high_pressure(SFmode) || high_pressure (SImode))
+  if (high_pressure (SFmode) || high_pressure (SImode)
+      || implicit_r0_pressure (ready, *n_readyp)) 
     skip_cycles = 1;
 
   return cached_can_issue_more;

^ permalink raw reply	[flat|nested] 9+ messages in thread

* PR target/16482: first scheduling pass on SH4
@ 2004-10-15 10:56 tm_gccmail
  2004-10-15 15:52 ` Kaz Kojima
  0 siblings, 1 reply; 9+ messages in thread
From: tm_gccmail @ 2004-10-15 10:56 UTC (permalink / raw)
  To: kkojima, naveens; +Cc: gcc


Sorry, deleted the original message to which I should be replying, which
screws up the threading, but anyway:

Original message: http://gcc.gnu.org/ml/gcc/2004-10/msg00176.html

Kaz Kojima wrote:

> Thus the first insn scheduling permutes [r4+r161]=r164 into a lifetime
> of R0 and the spill process needs R0 for the expression for [r4+r161]
> because SH-4 uses R0 as the base register for the addressing mode with
> an index register.
>
> Does the first insn scheduling for SH-4 take account of such insn which
> uses R0 implicitly?  Or is it a problem of the reload pass?

The first instruction scheduling pass cannot extend the lifetime of
pseudos which need r0 since there is only one r0 register.

So yes, this sounds like an SH first instruction scheduling pass bug.

Incidentally, I can reproduce the same problem with a small array index
and -fPIC. Compile this with -O2 -mf -fPIC:

int glob1;

adrreg01limm1_set (float *p0)
{
        p0[10] = (float) glob1;
}

test2.c: In function 'adrreg01limm1_set':
test2.c:7: error: unable to find a register to spill in class 'R0_REGS'
test2.c:7: error: this is the insn:
(insn 49 50 18 0 (set (reg/f:SI 1 r1 [162])
        (mem/u:SI (plus:SI (reg:SI 12 r12)
                (reg/f:SI 1 r1 [163])) [0 S4 A32])) 129 {movsi_ie} (nil)
    (expr_list:REG_DEAD (reg/f:SI 1 r1 [163])
        (expr_list:REG_EQUIV (symbol_ref:SI ("glob1") <var_decl 0x401c215c
glob1>)
            (nil))))
test2.c:7: internal compiler error: in spill_failure, at /reload1.c:1881

Toshi

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2004-11-05 21:44 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2004-10-06 12:23 PR target/16482: first scheduling pass on SH4 Kaz Kojima
2004-10-15 10:56 tm_gccmail
2004-10-15 15:52 ` Kaz Kojima
2004-10-29 12:21   ` Alexandre Oliva
2004-10-29 13:30     ` Kaz Kojima
2004-10-29 14:36       ` tm_gccmail
2004-11-04  1:44       ` Kaz Kojima
2004-11-05 16:04         ` Joern RENNECKE
2004-11-05 21:44           ` Kaz Kojima

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).