public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed
* Scheduler questions (related to PR17808)
@ 2005-06-29 22:46 Steven Bosscher
  2005-06-29 23:30 ` Dale Johannesen
                   ` (3 more replies)
  0 siblings, 4 replies; 19+ messages in thread
From: Steven Bosscher @ 2005-06-29 22:46 UTC (permalink / raw)
  To: Vladimir Makarov; +Cc: rearnsha, gcc


Hi,

I have a question about the scheduler.  Forgive me if I'm totally
missing the point here, this scheduling business is not my thing ;-)

Consider the following snippet that I've derived from PR17808 with a
few hacks in the compiler to renumber insns and dump RTL with all the
dependencies before scheduling.  There is a predicate register that
gets set, then a few cond_exec insns, then a jump, and finally a set
using some of the registers that may be set by the cond_exec insns.
This is the RTL before scheduling:

(insn 8 7 9 0 (set (reg:BI 262 p6 [353])
        (ne:BI (reg/v:SI 15 r15 [orig:348 b1 ] [348])
            (const_int 0 [0x0]))) 226 {*cmpsi_normal} (insn_list:REG_DEP_TRUE 7 (nil))
    (nil))

(insn 9 8 10 0 (cond_exec (ne (reg:BI 262 p6 [353])
            (const_int 0 [0x0]))
        (set (reg/v/f:DI 14 r14 [orig:347 t16 ] [347])
            (reg/v/f:DI 112 r32 [orig:351 t ] [351]))) 680 {sync_lock_releasedi+5} (insn_list:REG_DEP_TRUE 8 (nil))
    (nil))

(insn 10 9 11 0 (cond_exec (ne (reg:BI 262 p6 [353])
            (const_int 0 [0x0]))
        (set (reg/v:SI 17 r17 [orig:346 iftmp ] [346])
            (const_int 0 [0x0]))) 679 {sync_lock_releasedi+4} (insn_list:REG_DEP_TRUE 8 (nil))
    (nil))

(insn 11 10 12 0 (cond_exec (ne (reg:BI 262 p6 [353])
            (const_int 0 [0x0]))
        (set (reg/v:SI 16 r16 [orig:349 i ] [349])
            (const_int 0 [0x0]))) 679 {sync_lock_releasedi+4} (insn_list:REG_DEP_TRUE 8 (nil))
    (nil))

(jump_insn 12 11 13 0 (set (pc)
        (if_then_else (eq (reg:BI 262 p6 [353])
                (const_int 0 [0x0]))
            (label_ref:DI 39)
            (pc))) 235 {*br_true} (insn_list:REG_DEP_TRUE 8 (insn_list:REG_DEP_ANTI 7 (nil)))
    (expr_list:REG_DEAD (reg:BI 262 p6 [353])
        (expr_list:REG_BR_PROB (const_int 3300 [0xce4])
            (nil))))
;; End of basic block 0, registers live:
 1 [r1] 12 [r12] 14 [r14] 15 [r15] 16 [r16] 17 [r17] 18 [r18] 112 [r32] 320 [b0] 331 [ar.pfs]

;; Start of basic block 1, registers live: 1 [r1] 12 [r12] 14 [r14] 16 [r16] 17 [r17] 18 [r18] 112 [r32] 320 [b0] 331 [ar.pfs]
(note 13 12 14 1 [bb 1] NOTE_INSN_BASIC_BLOCK)

(insn 14 13 15 1 (set (mem:SI (reg/v/f:DI 14 r14 [orig:347 t16 ] [347]) [2 S4 A32])
        (reg/v:SI 17 r17 [orig:346 iftmp ] [346])) 4 {*movsi_internal} (insn_list:REG_DEP_TRUE 12 (insn_list:REG_DEP_ANTI 7 (nil)))
    (expr_list:REG_DEAD (reg/v:SI 17 r17 [orig:346 iftmp ] [346])
        (expr_list:REG_DEAD (reg/v/f:DI 14 r14 [orig:347 t16 ] [347])
            (nil))))


Then the ia64 machine-reorg scheduler gets to work, and it produces:

(insn:TI 8 70 12 0 (set (reg:BI 262 p6 [353])
        (ne:BI (reg/v:SI 15 r15 [orig:348 b1 ] [348])
            (const_int 0 [0x0]))) 226 {*cmpsi_normal} (insn_list:REG_DEP_TRUE 7 (nil))
    (nil))

(jump_insn 12 8 77 0 (set (pc)
        (if_then_else (eq (reg:BI 262 p6 [353])
                (const_int 0 [0x0]))
            (label_ref:DI 39)
            (pc))) 235 {*br_true} (insn_list:REG_DEP_TRUE 8 (insn_list:REG_DEP_ANTI 7 (nil)))
    (expr_list:REG_DEAD (reg:BI 262 p6 [353])
        (expr_list:REG_BR_PROB (const_int 3300 [0xce4])
            (nil))))

(note 77 12 69 1 [bb 1] NOTE_INSN_BASIC_BLOCK)

(... other non-jump, non-cond_exec insns ...)

(insn 14 15 16 1 (set (mem:SI (reg/v/f:DI 14 r14 [orig:347 t16 ] [347]) [2 S4 A32])
        (reg/v:SI 17 r17 [orig:346 iftmp ] [346])) 4 {*movsi_internal} (insn_list:REG_DEP_TRUE 12 (insn_list:REG_DEP_ANTI 7 (nil)))
    (expr_list:REG_DEAD (reg/v:SI 17 r17 [orig:346 iftmp ] [346])
        (expr_list:REG_DEAD (reg/v/f:DI 14 r14 [orig:347 t16 ] [347])
            (nil))))

(... other non-jump, non-cond_exec insns ...)

(insn 10 18 11 1 (cond_exec (ne (reg:BI 262 p6 [353])
            (const_int 0 [0x0]))
        (set (reg/v:SI 17 r17 [orig:346 iftmp ] [346])
            (const_int 0 [0x0]))) 679 {sync_lock_releasedi+4} (insn_list:REG_DEP_TRUE 8 (nil))
    (nil))

(insn 11 10 67 1 (cond_exec (ne (reg:BI 262 p6 [353])
            (const_int 0 [0x0]))
        (set (reg/v:SI 16 r16 [orig:349 i ] [349])
            (const_int 0 [0x0]))) 679 {sync_lock_releasedi+4} (insn_list:REG_DEP_TRUE 8 (nil))
    (nil))

(... other non-jump, non-cond_exec insns ...)

(insn:TI 9 60 65 1 (cond_exec (ne (reg:BI 262 p6 [353])
            (const_int 0 [0x0]))
        (set (reg/v/f:DI 14 r14 [orig:347 t16 ] [347])
            (reg/v/f:DI 112 r32 [orig:351 t ] [351]))) 680 {sync_lock_releasedi+5} (insn_list:REG_DEP_TRUE 8 (nil))
    (nil))


Notice how the conditional sets of r14 and r17 in insns 9 and 10 have
been moved past insn 14, which uses these registers.  Shouldn't there
be true dependencies on insns 9 and 10 for insn 14?

Gr.
Steven

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Scheduler questions (related to PR17808)
  2005-06-29 22:46 Scheduler questions (related to PR17808) Steven Bosscher
@ 2005-06-29 23:30 ` Dale Johannesen
  2005-06-30  9:32 ` Richard Earnshaw
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 19+ messages in thread
From: Dale Johannesen @ 2005-06-29 23:30 UTC (permalink / raw)
  To: Steven Bosscher; +Cc: rearnsha, gcc, Dale Johannesen, Vladimir Makarov

On Jun 29, 2005, at 3:46 PM, Steven Bosscher wrote:
> I have a question about the scheduler.  Forgive me if I'm totally
> missing the point here, this scheduling business is not my thing ;-)
>
> Consider the following snippet that I've derived from PR17808 with a
> few hacks in the compiler to renumber insns and dump RTL with all the
> dependencies before scheduling.  There is a predicate register that
> gets set, then a few cond_exec insns, then a jump, and finally a set
> using some of the registers that may be set by the cond_exec insns.
> This is the RTL before scheduling:
>
> <snip>

> Notice how the conditional sets of r14 and r17 in insns 9 and 10 have
> been moved past insn 14, which uses these registers.  Shouldn't there
> be true dependencies on insns 9 and 10 for insn 14?

I think so.  This is figured out in sched_analyze_insn in sched-deps.c, 
I'd
suggest stepping through there.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Scheduler questions (related to PR17808)
  2005-06-29 22:46 Scheduler questions (related to PR17808) Steven Bosscher
  2005-06-29 23:30 ` Dale Johannesen
@ 2005-06-30  9:32 ` Richard Earnshaw
  2005-06-30 10:22   ` Steven Bosscher
  2005-06-30 13:50 ` Vladimir Makarov
  2005-06-30 14:20 ` Mostafa Hagog
  3 siblings, 1 reply; 19+ messages in thread
From: Richard Earnshaw @ 2005-06-30  9:32 UTC (permalink / raw)
  To: Steven Bosscher; +Cc: Vladimir Makarov, gcc

On Wed, 2005-06-29 at 23:46, Steven Bosscher wrote:
> Hi,
> 
> I have a question about the scheduler.  Forgive me if I'm totally
> missing the point here, this scheduling business is not my thing ;-)
> 
> Consider the following snippet that I've derived from PR17808 with a
> few hacks in the compiler to renumber insns and dump RTL with all the
> dependencies before scheduling.  

[...]
> 
> Notice how the conditional sets of r14 and r17 in insns 9 and 10 have
> been moved past insn 14, which uses these registers.  Shouldn't there
> be true dependencies on insns 9 and 10 for insn 14?

Hmm, certainly looks suspicious, but it's not really what the original
bug report was about.  This might be some interblock scheduling problem
that is also triggered by the extra scheduling freedom we get with the
disabled change.

The original problem was one where a BB with 2 output edges could have a
non-branch insn as the final instruction.  This can't happen with
non-predicated machines, but when you have predicated insns the
execution path that leads to the non-fallthrough edge may not need to
suppress all the non-executed insns - we can just move them after the
branch.  Unfortunately, later passes can't handle this (they expect a BB
with multiple exit edges to have a branch as the final insn.

I think the way to handle that is to push the 'left-over' insns onto the
fall-through edge and then create a new BB for them.  If there is no
code label after the fall-through, then the new BB can be coalesced with
the fall-through block, otherwise it will become a new BB in its own
right.  It might even be possible to remove the predication on the
instructions if they all use the same predicate register and the
condition is the inverse of the branch condition.

R.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Scheduler questions (related to PR17808)
  2005-06-30  9:32 ` Richard Earnshaw
@ 2005-06-30 10:22   ` Steven Bosscher
  2005-06-30 10:53     ` Richard Earnshaw
  0 siblings, 1 reply; 19+ messages in thread
From: Steven Bosscher @ 2005-06-30 10:22 UTC (permalink / raw)
  To: Richard Earnshaw; +Cc: Vladimir Makarov, gcc

On Thursday 30 June 2005 11:32, Richard Earnshaw wrote:

> Hmm, certainly looks suspicious, but it's not really what the original
> bug report was about.  This might be some interblock scheduling problem
> that is also triggered by the extra scheduling freedom we get with the
> disabled change.

Hmm, right, but both related to COND_EXECs.  Looks very broken ;-)

> I think the way to handle that is to push the 'left-over' insns onto the
> fall-through edge and then create a new BB for them.  If there is no
> code label after the fall-through, then the new BB can be coalesced with
> the fall-through block, otherwise it will become a new BB in its own
> right.  It might even be possible to remove the predication on the
> instructions if they all use the same predicate register and the
> condition is the inverse of the branch condition.

Well, if I understand correctly, there is normally no interblock
scheduling after reload.  IA-64 is just special (as usual), but even
then, the scheduler is not really even aware of basic blocks at all.

For your problem, I think the jump should just be forced somehow to
be the last insn in the block.  You can't move things into other
basic blocks if you are not doing interblock scheduling, after all.
For the PR I was looking at, someone who actually understands the
scheduler will have to figure out a fix :-/

Gr.
Steven

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Scheduler questions (related to PR17808)
  2005-06-30 10:22   ` Steven Bosscher
@ 2005-06-30 10:53     ` Richard Earnshaw
  2005-06-30 11:57       ` Steven Bosscher
  0 siblings, 1 reply; 19+ messages in thread
From: Richard Earnshaw @ 2005-06-30 10:53 UTC (permalink / raw)
  To: Steven Bosscher; +Cc: Vladimir Makarov, gcc

On Thu, 2005-06-30 at 11:22, Steven Bosscher wrote:
> For your problem, I think the jump should just be forced somehow to
> be the last insn in the block.  You can't move things into other
> basic blocks if you are not doing interblock scheduling, after all.
> For the PR I was looking at, someone who actually understands the
> scheduler will have to figure out a fix :-/

The PR is exactly the point that we *don't* want to force the jump to
the end of the block: we want to put the branch at the optimal point in
the schedule and move any left over conditional insns to a place where
they don't waste time when they don't compute anything.  There's no
point in conditionally not executing an instruction if it takes a cycle
to do so with no benefit.  Consider the following instruction sequence
(where C and c are opposite conditions):

	T = [addr]
	C? addr += 4
	C? X += 12
	c? T += 1
	c? Jump foo

Now on a processor with a one cycle stall on a memory access the optimal
sequence here is

	T = [addr]
	C? addr += 4
	c? T += 1
	c? jump foo
	(C?) X += 12 // Note: C? is optional -- will always be true

We don't want to put the 'X += 12' before the branch because it just
wastes a cycle of execution time when the branch is taken.

R.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Scheduler questions (related to PR17808)
  2005-06-30 10:53     ` Richard Earnshaw
@ 2005-06-30 11:57       ` Steven Bosscher
  0 siblings, 0 replies; 19+ messages in thread
From: Steven Bosscher @ 2005-06-30 11:57 UTC (permalink / raw)
  To: Richard Earnshaw; +Cc: Vladimir Makarov, gcc

On Thursday 30 June 2005 12:53, Richard Earnshaw wrote:
> On Thu, 2005-06-30 at 11:22, Steven Bosscher wrote:
> > For your problem, I think the jump should just be forced somehow to
> > be the last insn in the block.  You can't move things into other
> > basic blocks if you are not doing interblock scheduling, after all.
> > For the PR I was looking at, someone who actually understands the
> > scheduler will have to figure out a fix :-/
>
> The PR is exactly the point that we *don't* want to force the jump to
> the end of the block: we want to put the branch at the optimal point in
> the schedule and move any left over conditional insns to a place where
> they don't waste time when they don't compute anything.

Maybe that is your interpretation of that PR, but not mine (and I filed
it).  My interpretation is that there is a problem that if some correct
code in sched-deps.c is enabled, things break elsewhere.

> There's no 
> point in conditionally not executing an instruction if it takes a cycle
> to do so with no benefit.  Consider the following instruction sequence
> (where C and c are opposite conditions):
>
> 	T = [addr]
> 	C? addr += 4
> 	C? X += 12
> 	c? T += 1
> 	c? Jump foo
>
> Now on a processor with a one cycle stall on a memory access the optimal
> sequence here is
>
> 	T = [addr]
> 	C? addr += 4
> 	c? T += 1
> 	c? jump foo
> 	(C?) X += 12 // Note: C? is optional -- will always be true
>
> We don't want to put the 'X += 12' before the branch because it just
> wastes a cycle of execution time when the branch is taken.

So?  If you want to move the branch to an optimal point, you should
enable interblock scheduling.  You can't do intrablock scheduling and
at the same expect the scheduler to move insns to other blocks.  By
definition of "intra-block", you can't move insns to other blocks.

Gr.
Steven

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Scheduler questions (related to PR17808)
  2005-06-29 22:46 Scheduler questions (related to PR17808) Steven Bosscher
  2005-06-29 23:30 ` Dale Johannesen
  2005-06-30  9:32 ` Richard Earnshaw
@ 2005-06-30 13:50 ` Vladimir Makarov
  2005-06-30 14:19   ` Andrey Belevantsev
  2005-06-30 14:20 ` Mostafa Hagog
  3 siblings, 1 reply; 19+ messages in thread
From: Vladimir Makarov @ 2005-06-30 13:50 UTC (permalink / raw)
  To: Steven Bosscher; +Cc: rearnsha, gcc

Steven Bosscher wrote:

>Notice how the conditional sets of r14 and r17 in insns 9 and 10 have
>been moved past insn 14, which uses these registers.  Shouldn't there
>be true dependencies on insns 9 and 10 for insn 14?
>
>  
>
In theory, yes.  But the scheduler (it is ebb scheduler as I understand) 
frequently removes dependencies creating a barrier and all dependencies 
to the insns before the barrier are changed by dependencies to the 
barrier (in this case the barrier is the jump insn and dependency 14->9 
is changed by 14->the jump).  The code should work ok if there is no 
movement down through jump insn (it is how the scheduler, ebb or 
interblock one, should work).   But the scheduler did it for insn #9.

I'll look at this PR today.

Vlad


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Scheduler questions (related to PR17808)
  2005-06-30 13:50 ` Vladimir Makarov
@ 2005-06-30 14:19   ` Andrey Belevantsev
  2005-06-30 20:29     ` Steven Bosscher
                       ` (4 more replies)
  0 siblings, 5 replies; 19+ messages in thread
From: Andrey Belevantsev @ 2005-06-30 14:19 UTC (permalink / raw)
  To: Vladimir Makarov; +Cc: Steven Bosscher, rearnsha, gcc

[-- Attachment #1: Type: text/plain, Size: 940 bytes --]

Vladimir Makarov wrote:
> I'll look at this PR today.

We've looked today at this issue. We think the problem is that proposed 
patch of sched_get_condition() treats conditional jumps likely to 
COND_EXECs, but it doesn't fix other places in sched-deps, where 
COND_EXECs are considered. Maxim Kuvyrkov proposed the attached patch, 
which allows gcc to bootstrap on ia64 and fixes the testcase in PR.

We've also found that current mainline ICEs compiling the testcase with 
"-O0 -fschedule-insns -fschedule-insns2". That is because after reload 
several pseudos still remain in global_live_at_start sets. The pseudos 
then appear in regsets through compute_jump_reg_dependencies(), and 
sched-deps segfaults at EXECUTE_IF_SET_IN_REG_SET loop at sched-deps.c:948.

We don't know reload well enough to know for sure which place should be 
fixed in reload, or maybe in update_life_info(). Is this issue worth 
opening another PR?

Andrey





[-- Attachment #2: gcc-sched-deps.patch --]
[-- Type: text/plain, Size: 1125 bytes --]

--- gcc/gcc/sched-deps.c	Sun Jun 19 16:37:49 2005
+++ orig/gcc/sched-deps.c	Thu Jun 30 18:00:23 2005
@@ -149,7 +149,7 @@
     return 0;
 
   src = SET_SRC (pc_set (insn));
-#if 0
+#if 1
   /* The previous code here was completely invalid and could never extract
      the condition from a jump.  This code does the correct thing, but that
      triggers latent bugs later in the scheduler on ports with conditional
@@ -1019,7 +1019,8 @@
     {
       /* In the case of barrier the most added dependencies are not
          real, so we use anti-dependence here.  */
-      if (GET_CODE (PATTERN (insn)) == COND_EXEC)
+      /* if (GET_CODE (PATTERN (insn)) == COND_EXEC)  */
+      if (sched_get_condition (insn))
 	{
 	  EXECUTE_IF_SET_IN_REG_SET (&deps->reg_last_in_use, 0, i, rsi)
 	    {
@@ -1066,7 +1067,8 @@
     {
       /* If the current insn is conditional, we can't free any
 	 of the lists.  */
-      if (GET_CODE (PATTERN (insn)) == COND_EXEC)
+      /* if (GET_CODE (PATTERN (insn)) == COND_EXEC)  */
+      if (sched_get_condition (insn))
 	{
 	  EXECUTE_IF_SET_IN_REG_SET (reg_pending_uses, 0, i, rsi)
 	    {

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Scheduler questions (related to PR17808)
  2005-06-29 22:46 Scheduler questions (related to PR17808) Steven Bosscher
                   ` (2 preceding siblings ...)
  2005-06-30 13:50 ` Vladimir Makarov
@ 2005-06-30 14:20 ` Mostafa Hagog
  2005-06-30 16:47   ` Steven Bosscher
  3 siblings, 1 reply; 19+ messages in thread
From: Mostafa Hagog @ 2005-06-30 14:20 UTC (permalink / raw)
  To: Steven Bosscher; +Cc: gcc, rearnsha, Vladimir Makarov





Hi,

Steven Bosscher <stevenb@suse.de> wrote on 30/06/2005 01:46:22:

>
> Hi,
>
[snip]
> Then the ia64 machine-reorg scheduler gets to work, and it produces:
>
> (insn:TI 8 70 12 0 (set (reg:BI 262 p6 [353])
>         (ne:BI (reg/v:SI 15 r15 [orig:348 b1 ] [348])
>             (const_int 0 [0x0]))) 226 {*cmpsi_normal}
(insn_list:REG_DEP_TRUE 7 (nil))
>     (nil))
>
> (jump_insn 12 8 77 0 (set (pc)
>         (if_then_else (eq (reg:BI 262 p6 [353])
>                 (const_int 0 [0x0]))
>             (label_ref:DI 39)
                         ^^^^^
It would help to know what is the target of this jump;
If it is bb1 then the move of the insn 14 above 9 and 10
is legal because the predicate doesn't hold for them.
I am not sure if the scheduler is that smart to make such
decisions but it would help to get all of the RTL to get
the full picture.

Mostafa.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Scheduler questions (related to PR17808)
  2005-06-30 14:20 ` Mostafa Hagog
@ 2005-06-30 16:47   ` Steven Bosscher
  0 siblings, 0 replies; 19+ messages in thread
From: Steven Bosscher @ 2005-06-30 16:47 UTC (permalink / raw)
  To: Mostafa Hagog; +Cc: gcc, rearnsha, Vladimir Makarov

[-- Attachment #1: Type: text/plain, Size: 1028 bytes --]

On Thursday 30 June 2005 16:19, Mostafa Hagog wrote:
> Hi,
>
> Steven Bosscher <stevenb@suse.de> wrote on 30/06/2005 01:46:22:
> > Hi,
>
> [snip]
>
> > Then the ia64 machine-reorg scheduler gets to work, and it produces:
> >
> > (insn:TI 8 70 12 0 (set (reg:BI 262 p6 [353])
> >         (ne:BI (reg/v:SI 15 r15 [orig:348 b1 ] [348])
> >             (const_int 0 [0x0]))) 226 {*cmpsi_normal}
>
> (insn_list:REG_DEP_TRUE 7 (nil))
>
> >     (nil))
> >
> > (jump_insn 12 8 77 0 (set (pc)
> >         (if_then_else (eq (reg:BI 262 p6 [353])
> >                 (const_int 0 [0x0]))
> >             (label_ref:DI 39)
>
>                          ^^^^^
> It would help to know what is the target of this jump;
> If it is bb1 then the move of the insn 14 above 9 and 10
> is legal because the predicate doesn't hold for them.
> I am not sure if the scheduler is that smart to make such
> decisions but it would help to get all of the RTL to get
> the full picture.

I have attached the full .mach dump from which I quoted.

Gr.
Steven


[-- Attachment #2: t.c.35.mach.gz --]
[-- Type: application/x-gzip, Size: 10817 bytes --]

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Scheduler questions (related to PR17808)
  2005-06-30 14:19   ` Andrey Belevantsev
@ 2005-06-30 20:29     ` Steven Bosscher
  2005-06-30 21:07     ` Steven Bosscher
                       ` (3 subsequent siblings)
  4 siblings, 0 replies; 19+ messages in thread
From: Steven Bosscher @ 2005-06-30 20:29 UTC (permalink / raw)
  To: Andrey Belevantsev; +Cc: Vladimir Makarov, rearnsha, gcc

On Thursday 30 June 2005 16:18, Andrey Belevantsev wrote:
> We don't know reload well enough to know for sure which place should be
> fixed in reload, or maybe in update_life_info(). Is this issue worth
> opening another PR?

This is a separate PR.

Actually, I'm surprised that -O0 -fschedule-insns works at all.

Gr.
Steven

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Scheduler questions (related to PR17808)
  2005-06-30 14:19   ` Andrey Belevantsev
  2005-06-30 20:29     ` Steven Bosscher
@ 2005-06-30 21:07     ` Steven Bosscher
  2005-06-30 22:11       ` James E Wilson
  2005-06-30 22:28     ` James E Wilson
                       ` (2 subsequent siblings)
  4 siblings, 1 reply; 19+ messages in thread
From: Steven Bosscher @ 2005-06-30 21:07 UTC (permalink / raw)
  To: Andrey Belevantsev; +Cc: Vladimir Makarov, rearnsha, gcc

On Thursday 30 June 2005 16:18, Andrey Belevantsev wrote:
> Vladimir Makarov wrote:
> > I'll look at this PR today.
>
> We've looked today at this issue. We think the problem is that proposed
> patch of sched_get_condition() treats conditional jumps likely to
> COND_EXECs, but it doesn't fix other places in sched-deps, where
> COND_EXECs are considered. Maxim Kuvyrkov proposed the attached patch,
> which allows gcc to bootstrap on ia64 and fixes the testcase in PR.

Yeah, it fixes the test case and still allows the motion of the
cond_exec insns past the branch.  It probably doesn't fix Richard
Earnshaw's problem (which is IMHO a separate problem), but at
least for this case the patch appears to do the right thing.

Of course I can't approve it.  But I diffed the output of the
patched compiler against the unpatched one.  Looks OK.  It is
interesting to see that the "(p6)" checks are in front of those
insns, when in fact we already know that p6 must be true if the
branch is taken (because in gcc, p7 not set implies p6 set ;-).
Are predicate checks free, or should there be some post-pass to
clean up this kind of useless predication?

Thanks for looking at this bug.

Gr.
Steven



	.proc bar#				.proc bar#
bar:					bar:
	.prologue				.prologue
	.body					.body
	.mmi					.mmi
	addl r18 = @ltoffx(b#), r1		addl r18 = @ltoffx(b#), r1
	;;					;;
	ld8.mov r18 = [r18], b#			ld8.mov r18 = [r18], b#
	nop 0					nop 0
	;;					;;
	.mmi			      |		.mii
	ld4 r15 = [r18]				ld4 r15 = [r18]
				      >		nop 0
	;;					;;
	cmp4.ne p6, p7 = 0, r15			cmp4.ne p6, p7 = 0, r15
				      >		.mmb
	nop 0					nop 0
				      >		nop 0
				      >		(p7) br.cond.dpnt .L4
	;;					;;
	.mmi					.mmi
				      >		(p6) mov r16 = r0
	(p6) mov r14 = r32			(p6) mov r14 = r32
	(p6) mov r17 = r0			(p6) mov r17 = r0
	(p6) mov r16 = r0	      <
	.mib			      <
	nop 0			      <
	nop 0			      <
	(p7) br.cond.dpnt .L4	      <
	;;					;;
	.mmi					.mmi
	adds r16 = 1, r16			adds r16 = 1, r16
	st4 [r14] = r17				st4 [r14] = r17
	mov r17 = r0				mov r17 = r0
	;;					;;
	.mmi					.mmi
	addp4 r14 = r16, r0			addp4 r14 = r16, r0
	ld4 r15 = [r18]				ld4 r15 = [r18]
	cmp4.ne p8, p9 = 0, r16			cmp4.ne p8, p9 = 0, r16
	;;					;;
	.mib					.mib
	shladd r14 = r14, 2, r32		shladd r14 = r14, 2, r32
	cmp4.gtu p10, p11 = r15, r16		cmp4.gtu p10, p11 = r15, r16
	(p11) br.cond.dpnt .L4			(p11) br.cond.dpnt .L4
	;;					;;
.L6:					.L6:
	.pred.rel.mutex p8, p9			.pred.rel.mutex p8, p9
[.L7:]					[.L7:]
	.mmi					.mmi
	(p8) ld4 r17 = [r14]			(p8) ld4 r17 = [r14]
	nop 0					nop 0
	adds r16 = 1, r16			adds r16 = 1, r16
	;;					;;
	.mmi					.mmi
	st4 [r14] = r17				st4 [r14] = r17
	addp4 r14 = r16, r0			addp4 r14 = r16, r0
	mov r17 = r0				mov r17 = r0
	.mmi					.mmi
	cmp4.ne p8, p9 = 0, r16			cmp4.ne p8, p9 = 0, r16
	ld4 r15 = [r18]				ld4 r15 = [r18]
	nop 0					nop 0
	;;					;;
	.mib					.mib
	shladd r14 = r14, 2, r32		shladd r14 = r14, 2, r32
	cmp4.gtu p10, p11 = r15, r16		cmp4.gtu p10, p11 = r15, r16
	(p10) br.cond.dptk .L6			(p10) br.cond.dptk .L6
.L4:					.L4:
	.mii					.mii
	nop 0					nop 0
	addp4 r3 = r15, r0			addp4 r3 = r15, r0
	;;					;;
	shladd r2 = r3, 2, r32			shladd r2 = r3, 2, r32
	;;					;;
	.mib					.mib
	ld4 r8 = [r2]				ld4 r8 = [r2]
	nop 0					nop 0
	br.ret.sptk.many b0			br.ret.sptk.many b0
	.endp bar#				.endp bar#


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Scheduler questions (related to PR17808)
  2005-06-30 21:07     ` Steven Bosscher
@ 2005-06-30 22:11       ` James E Wilson
  0 siblings, 0 replies; 19+ messages in thread
From: James E Wilson @ 2005-06-30 22:11 UTC (permalink / raw)
  To: Steven Bosscher; +Cc: Vladimir Makarov, rearnsha, gcc

Steven Bosscher wrote:
> Are predicate checks free, or should there be some post-pass to
> clean up this kind of useless predication?

On IA-64, predicate checks are free in terms of run-time.  There is the 
problem that unnecessary predicate checks might increase register 
lifetimes, causing extra register pressure.  However, this would matter 
only if you are going to do some register allocation after eliminating 
the now unnecessary predicate checks, and that doesn't seem to be an 
issue here.
-- 
Jim Wilson, GNU Tools Support, http://www.SpecifixInc.com

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Scheduler questions (related to PR17808)
  2005-06-30 14:19   ` Andrey Belevantsev
  2005-06-30 20:29     ` Steven Bosscher
  2005-06-30 21:07     ` Steven Bosscher
@ 2005-06-30 22:28     ` James E Wilson
  2005-06-30 23:06     ` Vladimir Makarov
  2005-07-05 18:30     ` Vladimir Makarov
  4 siblings, 0 replies; 19+ messages in thread
From: James E Wilson @ 2005-06-30 22:28 UTC (permalink / raw)
  To: Andrey Belevantsev; +Cc: Steven Bosscher, rearnsha, gcc

Andrey Belevantsev wrote:
> We've also found that current mainline ICEs compiling the testcase with 
> "-O0 -fschedule-insns -fschedule-insns2".

I suspect this is a bug in ia64_reorg in ia64.c.  It shouldn't be trying 
to schedule during a non-optimizing compile.  So the line
   if (ia64_flag_schedule_insns2)
should be
   if (optimize && ia64_flag_schedule_insns2)
That gives us the desired affect that -fschedule-insns2 only works when 
optimizing.
-- 
Jim Wilson, GNU Tools Support, http://www.SpecifixInc.com

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Scheduler questions (related to PR17808)
  2005-06-30 14:19   ` Andrey Belevantsev
                       ` (2 preceding siblings ...)
  2005-06-30 22:28     ` James E Wilson
@ 2005-06-30 23:06     ` Vladimir Makarov
  2005-06-30 23:26       ` Steven Bosscher
  2005-06-30 23:29       ` Steven Bosscher
  2005-07-05 18:30     ` Vladimir Makarov
  4 siblings, 2 replies; 19+ messages in thread
From: Vladimir Makarov @ 2005-06-30 23:06 UTC (permalink / raw)
  To: Andrey Belevantsev; +Cc: Steven Bosscher, rearnsha, gcc

Andrey Belevantsev wrote:

> Vladimir Makarov wrote:
>
>> I'll look at this PR today.
>
>
> We've looked today at this issue. We think the problem is that 
> proposed patch of sched_get_condition() treats conditional jumps 
> likely to COND_EXECs, but it doesn't fix other places in sched-deps, 
> where COND_EXECs are considered. Maxim Kuvyrkov proposed the attached 
> patch, which allows gcc to bootstrap on ia64 and fixes the testcase in 
> PR.
>
It seems like processing cond-exec was added long ago in 2000 with 
keeping Itanium only in mind.  But other targets with conditional 
execution have other forms of jump insns.  So Richard Earnshaw made a 
quick fix a year ago.

On the first glance the patch looks ok for me but I am going to 
investigate this problem in more details.  So probably I'll approve it 
on the next week (sorry we have canadian holiday tomorrow).

Vlad


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Scheduler questions (related to PR17808)
  2005-06-30 23:06     ` Vladimir Makarov
@ 2005-06-30 23:26       ` Steven Bosscher
  2005-06-30 23:29       ` Steven Bosscher
  1 sibling, 0 replies; 19+ messages in thread
From: Steven Bosscher @ 2005-06-30 23:26 UTC (permalink / raw)
  To: Vladimir Makarov; +Cc: Andrey Belevantsev, rearnsha, gcc

On Friday 01 July 2005 01:05, Vladimir Makarov wrote:
> Andrey Belevantsev wrote:
> > Vladimir Makarov wrote:
> >> I'll look at this PR today.
> >
> > We've looked today at this issue. We think the problem is that
> > proposed patch of sched_get_condition() treats conditional jumps
> > likely to COND_EXECs, but it doesn't fix other places in sched-deps,
> > where COND_EXECs are considered. Maxim Kuvyrkov proposed the attached
> > patch, which allows gcc to bootstrap on ia64 and fixes the testcase in
> > PR.
>
> It seems like processing cond-exec was added long ago in 2000 with
> keeping Itanium only in mind.  But other targets with conditional
> execution have other forms of jump insns.  So Richard Earnshaw made a
> quick fix a year ago.

No.  In fact Richard fixed code that can't possibly have worked.  Ever.
And when he fixed it, sched_get_condition started working for the first
time, which of course uncovered a bunch of bugs latent up till then.

Gr.
Steven

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Scheduler questions (related to PR17808)
  2005-06-30 23:06     ` Vladimir Makarov
  2005-06-30 23:26       ` Steven Bosscher
@ 2005-06-30 23:29       ` Steven Bosscher
  1 sibling, 0 replies; 19+ messages in thread
From: Steven Bosscher @ 2005-06-30 23:29 UTC (permalink / raw)
  To: Vladimir Makarov; +Cc: Andrey Belevantsev, rearnsha, gcc

On Friday 01 July 2005 01:05, Vladimir Makarov wrote:
> Andrey Belevantsev wrote:
> > Vladimir Makarov wrote:
> >> I'll look at this PR today.
> >
> > We've looked today at this issue. We think the problem is that
> > proposed patch of sched_get_condition() treats conditional jumps
> > likely to COND_EXECs, but it doesn't fix other places in sched-deps,
> > where COND_EXECs are considered. Maxim Kuvyrkov proposed the attached
> > patch, which allows gcc to bootstrap on ia64 and fixes the testcase in
> > PR.
>
> It seems like processing cond-exec was added long ago in 2000 with
> keeping Itanium only in mind.  But other targets with conditional
> execution have other forms of jump insns.  So Richard Earnshaw made a
> quick fix a year ago.
>
> On the first glance the patch looks ok for me but I am going to
> investigate this problem in more details.  So probably I'll approve it
> on the next week (sorry we have canadian holiday tomorrow).

BTW this is what I have so far to fix Richard Earnshaw's problem.  It
just adds dependencies on all cond_exec insns to the jump at the end of
a basic block when doing intra-block scheduling.  The add_dependence
changes are necessary because it was literally impossible to add this
kind of dependency.  add_dependence simply refuses to add dependencies
with mutually exlusive conditions.

I have not even tried to compile this yet, this is just to give you an
idea of what I want to do.  Does it look like a sane approach?

Gr.
Steven

Index: sched-deps.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/sched-deps.c,v
retrieving revision 1.93
diff -u -3 -p -r1.93 sched-deps.c
--- sched-deps.c	25 Jun 2005 02:01:03 -0000	1.93
+++ sched-deps.c	30 Jun 2005 23:24:34 -0000
@@ -171,6 +171,7 @@ sched_get_condition (rtx insn)
   return 0;
 }
 
+\f
 /* Return nonzero if conditions COND1 and COND2 can never be both true.  */
 
 static int
@@ -184,6 +185,30 @@ conditions_mutex_p (rtx cond1, rtx cond2
     return 1;
   return 0;
 }
+
+/* Return true if insn1 and insn2 can never depend on one another because
+   the conditions under which they are executed are mutually exclusive.  */
+bool
+sched_insns_conditions_mutex_p (rtx insn1, rtx insn2)
+{
+  /* flow.c doesn't handle conditional lifetimes entirely correctly;
+     calls mess up the conditional lifetimes.  */
+  if (!CALL_P (insn) && !CALL_P (elem))
+    {
+      cond1 = sched_get_condition (insn);
+      cond2 = sched_get_condition (elem);
+      if (cond1 && cond2
+	  && conditions_mutex_p (cond1, cond2)
+	  /* Make sure first instruction doesn't affect condition of second
+	     instruction if switched.  */
+	  && !modified_in_p (cond1, elem)
+	  /* Make sure second instruction doesn't affect condition of first
+	     instruction if switched.  */
+	  && !modified_in_p (cond2, insn))
+	return true;
+    }
+  return false;
+}
 \f
 /* Add ELEM wrapped in an INSN_LIST with reg note kind DEP_TYPE to the
    LOG_LINKS of INSN, if not already there.  DEP_TYPE indicates the
@@ -207,26 +232,6 @@ add_dependence (rtx insn, rtx elem, enum
   if (NOTE_P (elem))
     return 0;
 
-  /* flow.c doesn't handle conditional lifetimes entirely correctly;
-     calls mess up the conditional lifetimes.  */
-  /* ??? add_dependence is the wrong place to be eliding dependencies,
-     as that forgets that the condition expressions themselves may
-     be dependent.  */
-  if (!CALL_P (insn) && !CALL_P (elem))
-    {
-      cond1 = sched_get_condition (insn);
-      cond2 = sched_get_condition (elem);
-      if (cond1 && cond2
-	  && conditions_mutex_p (cond1, cond2)
-	  /* Make sure first instruction doesn't affect condition of second
-	     instruction if switched.  */
-	  && !modified_in_p (cond1, elem)
-	  /* Make sure second instruction doesn't affect condition of first
-	     instruction if switched.  */
-	  && !modified_in_p (cond2, insn))
-	return 0;
-    }
-
   present_p = 1;
 #ifdef INSN_SCHEDULING
   /* ??? No good way to tell from here whether we're doing interblock
@@ -348,7 +353,10 @@ static void
 add_dependence_list (rtx insn, rtx list, enum reg_note dep_type)
 {
   for (; list; list = XEXP (list, 1))
-    add_dependence (insn, XEXP (list, 0), dep_type);
+    {
+      if (! sched_insns_conditions_mutex_p (insn, XEXP (list, 0)))
+	add_dependence (insn, XEXP (list, 0), dep_type);
+    }
 }
 
 /* Similar, but free *LISTP at the same time.  */
@@ -360,7 +368,8 @@ add_dependence_list_and_free (rtx insn, 
   for (list = *listp, *listp = NULL; list ; list = next)
     {
       next = XEXP (list, 1);
-      add_dependence (insn, XEXP (list, 0), dep_type);
+      if (! sched_insns_conditions_mutex_p (insn, XEXP (list, 0)))
+	add_dependence (insn, XEXP (list, 0), dep_type);
       free_INSN_LIST_node (list);
     }
 }
@@ -393,7 +402,7 @@ delete_all_dependences (rtx insn)
 static void
 fixup_sched_groups (rtx insn)
 {
-  rtx link;
+  rtx link, prev_nonnote;
 
   for (link = LOG_LINKS (insn); link ; link = XEXP (link, 1))
     {
@@ -405,14 +414,17 @@ fixup_sched_groups (rtx insn)
 	  if (XEXP (link, 0) == i)
 	    goto next_link;
 	} while (SCHED_GROUP_P (i));
-      add_dependence (i, XEXP (link, 0), REG_NOTE_KIND (link));
+      if (! sched_insns_conditions_mutex_p (i, XEXP (link, 0)))
+	add_dependence (i, XEXP (link, 0), REG_NOTE_KIND (link));
     next_link:;
     }
 
   delete_all_dependences (insn);
 
-  if (BLOCK_FOR_INSN (insn) == BLOCK_FOR_INSN (prev_nonnote_insn (insn)))
-    add_dependence (insn, prev_nonnote_insn (insn), REG_DEP_ANTI);
+  prev_nonnote = prev_nonnote_insn (insn);
+  if (BLOCK_FOR_INSN (insn) == BLOCK_FOR_INSN (prev_nonnote)
+      && ! sched_insns_conditions_mutex_p (insn, prev_nonnote))
+    add_dependence (insn, prev_nonnote, REG_DEP_ANTI);
 }
 \f
 /* Process an insn's memory dependencies.  There are four kinds of
@@ -620,7 +632,8 @@ sched_analyze_1 (struct deps *deps, rtx 
 	  pending_mem = deps->pending_read_mems;
 	  while (pending)
 	    {
-	      if (anti_dependence (XEXP (pending_mem, 0), t))
+	      if (anti_dependence (XEXP (pending_mem, 0), t)
+		  && ! sched_insns_conditions_mutex_p (insn, XEXP (pending, 0)))
 		add_dependence (insn, XEXP (pending, 0), REG_DEP_ANTI);
 
 	      pending = XEXP (pending, 1);
@@ -631,7 +644,8 @@ sched_analyze_1 (struct deps *deps, rtx 
 	  pending_mem = deps->pending_write_mems;
 	  while (pending)
 	    {
-	      if (output_dependence (XEXP (pending_mem, 0), t))
+	      if (output_dependence (XEXP (pending_mem, 0), t)
+		  && ! sched_insns_conditions_mutex_p (insn, XEXP (pending, 0)))
 		add_dependence (insn, XEXP (pending, 0), REG_DEP_OUTPUT);
 
 	      pending = XEXP (pending, 1);
@@ -759,7 +773,8 @@ sched_analyze_2 (struct deps *deps, rtx 
 	pending_mem = deps->pending_read_mems;
 	while (pending)
 	  {
-	    if (read_dependence (XEXP (pending_mem, 0), t))
+	    if (read_dependence (XEXP (pending_mem, 0), t)
+		&& ! sched_insns_conditions_mutex_p (insn, XEXP (pending, 0)))
 	      add_dependence (insn, XEXP (pending, 0), REG_DEP_ANTI);
 
 	    pending = XEXP (pending, 1);
@@ -771,16 +786,17 @@ sched_analyze_2 (struct deps *deps, rtx 
 	while (pending)
 	  {
 	    if (true_dependence (XEXP (pending_mem, 0), VOIDmode,
-				 t, rtx_varies_p))
-	      add_dependence (insn, XEXP (pending, 0), 0);
+				 t, rtx_varies_p)
+		&& ! sched_insns_conditions_mutex_p (insn, XEXP (pending, 0)))
+	      add_dependence (insn, XEXP (pending, 0), REG_DEP_TRUE);
 
 	    pending = XEXP (pending, 1);
 	    pending_mem = XEXP (pending_mem, 1);
 	  }
 
 	for (u = deps->last_pending_memory_flush; u; u = XEXP (u, 1))
-	  if (!JUMP_P (XEXP (u, 0))
-	      || deps_may_trap_p (x))
+	  if ((! JUMP_P (XEXP (u, 0)) || deps_may_trap_p (x))
+	      && ! sched_insns_conditions_mutex_p (insn, XEXP (pending, 0)))
 	    add_dependence (insn, XEXP (u, 0), REG_DEP_ANTI);
 
 	/* Always add these dependencies to pending_reads, since
@@ -966,7 +982,8 @@ sched_analyze_insn (struct deps *deps, r
 	  pending_mem = deps->pending_write_mems;
 	  while (pending)
 	    {
-	      add_dependence (insn, XEXP (pending, 0), REG_DEP_OUTPUT);
+	      if (! sched_insns_conditions_mutex_p (insn, XEXP (pending, 0)))
+		add_dependence (insn, XEXP (pending, 0), REG_DEP_OUTPUT);
 	      pending = XEXP (pending, 1);
 	      pending_mem = XEXP (pending_mem, 1);
 	    }
@@ -975,7 +992,8 @@ sched_analyze_insn (struct deps *deps, r
 	  pending_mem = deps->pending_read_mems;
 	  while (pending)
 	    {
-	      if (MEM_VOLATILE_P (XEXP (pending_mem, 0)))
+	      if (MEM_VOLATILE_P (XEXP (pending_mem, 0))
+		  && ! sched_insns_conditions_mutex_p (insn, XEXP (pending, 0)))
 		add_dependence (insn, XEXP (pending, 0), REG_DEP_OUTPUT);
 	      pending = XEXP (pending, 1);
 	      pending_mem = XEXP (pending_mem, 1);
Index: sched-ebb.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/sched-ebb.c,v
retrieving revision 1.44
diff -u -3 -p -r1.44 sched-ebb.c
--- sched-ebb.c	25 Jun 2005 02:01:03 -0000	1.44
+++ sched-ebb.c	30 Jun 2005 23:24:34 -0000
@@ -454,7 +454,8 @@ add_deps_for_risky_insns (rtx head, rtx 
 	    /* We can not change the mode of the backward
 	       dependency because REG_DEP_ANTI has the lowest
 	       rank.  */
-	    if (add_dependence (insn, prev, REG_DEP_ANTI))
+	    if (! sched_insns_conditions_mutex_p (insn, prev)
+		&& add_dependence (insn, prev, REG_DEP_ANTI))
 	      add_forward_dependence (prev, insn, REG_DEP_ANTI);
             break;
 
Index: sched-int.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/sched-int.h,v
retrieving revision 1.38
diff -u -3 -p -r1.38 sched-int.h
--- sched-int.h	25 Jun 2005 02:01:03 -0000	1.38
+++ sched-int.h	30 Jun 2005 23:24:35 -0000
@@ -331,6 +331,7 @@ enum INSN_TRAP_CLASS
 extern void print_insn (char *, rtx, int);
 
 /* Functions in sched-deps.c.  */
+extern bool sched_insns_conditions_mutex_p (rtx, rtx);
 extern int add_dependence (rtx, rtx, enum reg_note);
 extern void sched_analyze (struct deps *, rtx, rtx);
 extern void init_deps (struct deps *);
Index: sched-rgn.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/sched-rgn.c,v
retrieving revision 1.96
diff -u -3 -p -r1.96 sched-rgn.c
--- sched-rgn.c	25 Jun 2005 02:01:03 -0000	1.96
+++ sched-rgn.c	30 Jun 2005 23:24:35 -0000
@@ -1881,6 +1881,8 @@ add_branch_dependences (rtx head, rtx ta
      cc0 setters remain at the end because they can't be moved away from
      their cc0 user.
 
+     COND_EXEC insns cannot be moved past a branch (see e.g. PR17808).
+
      Insns setting CLASS_LIKELY_SPILLED_P registers (usually return values)
      are not moved before reload because we can wind up with register
      allocation failures.  */
@@ -1904,7 +1906,8 @@ add_branch_dependences (rtx head, rtx ta
 	{
 	  if (last != 0 && !find_insn_list (insn, LOG_LINKS (last)))
 	    {
-	      add_dependence (last, insn, REG_DEP_ANTI);
+	      if (! sched_insns_conditions_mutex_p (last, insn))
+		add_dependence (last, insn, REG_DEP_ANTI);
 	      INSN_REF_COUNT (insn)++;
 	    }
 
@@ -1930,9 +1933,61 @@ add_branch_dependences (rtx head, rtx ta
 	if (INSN_REF_COUNT (insn) != 0)
 	  continue;
 
-	add_dependence (last, insn, REG_DEP_ANTI);
+	if (! sched_insns_conditions_mutex_p (last, insn))
+	  add_dependence (last, insn, REG_DEP_ANTI);
 	INSN_REF_COUNT (insn) = 1;
       }
+
+#ifdef HAVE_conditional_execution
+  /* Finally, if the block ends in a jump, and we are doing intra-block
+     scheduling, make sure that the branch depends on any COND_EXEC insns
+     inside the block to avoid moving the COND_EXECs past the branch insn.
+
+     We only have to do this after reload, because (1) before reload there
+     are no COND_EXEC insns, and (2) the region scheduler is an intra-block
+     scheduler after reload.
+
+     FIXME: We could in some cases move COND_EXEC insns past the branch if
+     this scheduler would be a little smarter.  Consider this code:
+
+		T = [addr]
+	C  ?	addr += 4
+	C  ?	X += 12
+	C  ?	T += 1
+	!C ?	jump foo
+
+     On a target with a one cycle stall on a memory access the optimal
+     sequence would be:
+
+		T = [addr]
+	C  ?	addr += 4
+	C  ?	T += 1
+	C  ?	jump foo
+	!C ?	X += 12
+
+     We don't want to put the 'X += 12' before the branch because it just
+     wastes a cycle of execution time when the branch is taken.
+
+     Note that in the example "!C" will always be true.  That is another
+     possible improvement for handling COND_EXECs in this scheduler: it
+     could remove always-true predicates.  */
+
+  if (!reload_completed || ! JUMP_P (tail))
+    return;
+
+  insn = PREV_INSN (tail);
+  while (insn)
+    {
+      /* Note that we want to add this dependency even when
+	 sched_insns_conditions_mutex_p returns true.  The whole point
+	 is that we _want_ this dependency, even if these insns really
+	 are independent.  */
+      if (GET_CODE (PATTERN (insn)) == COND_EXEC)
+	add_dependence (tail, insn, REG_DEP_ANTI);
+
+      insn = PREV_INSN (insn);
+    }
+#endif
 }
 
 /* Data structures for the computation of data dependences in a regions.  We

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Scheduler questions (related to PR17808)
  2005-06-30 14:19   ` Andrey Belevantsev
                       ` (3 preceding siblings ...)
  2005-06-30 23:06     ` Vladimir Makarov
@ 2005-07-05 18:30     ` Vladimir Makarov
  2005-07-05 18:57       ` Steven Bosscher
  4 siblings, 1 reply; 19+ messages in thread
From: Vladimir Makarov @ 2005-07-05 18:30 UTC (permalink / raw)
  To: Andrey Belevantsev; +Cc: Steven Bosscher, rearnsha, gcc

Andrey Belevantsev wrote:

> Vladimir Makarov wrote:
>
>> I'll look at this PR today.
>
>
> We've looked today at this issue. We think the problem is that 
> proposed patch of sched_get_condition() treats conditional jumps 
> likely to COND_EXECs, but it doesn't fix other places in sched-deps, 
> where COND_EXECs are considered. Maxim Kuvyrkov proposed the attached 
> patch, which allows gcc to bootstrap on ia64 and fixes the testcase in 
> PR.

This patch is ok to fix PR17808.  You just need to remove conditional 
#if and Richard's comment in sched_get_condition.

>
> We've also found that current mainline ICEs compiling the testcase 
> with "-O0 -fschedule-insns -fschedule-insns2". That is because after 
> reload several pseudos still remain in global_live_at_start sets. The 
> pseudos then appear in regsets through 
> compute_jump_reg_dependencies(), and sched-deps segfaults at 
> EXECUTE_IF_SET_IN_REG_SET loop at sched-deps.c:948.
>
> We don't know reload well enough to know for sure which place should 
> be fixed in reload, or maybe in update_life_info(). Is this issue 
> worth opening another PR?

As Jim Wilson wrote some combinations of options for IA64 are not 
supposed to work since starting the port implementation.

>
> Andrey
>
>
>
>
>------------------------------------------------------------------------
>
>--- gcc/gcc/sched-deps.c	Sun Jun 19 16:37:49 2005
>+++ orig/gcc/sched-deps.c	Thu Jun 30 18:00:23 2005
>@@ -149,7 +149,7 @@
>     return 0;
> 
>   src = SET_SRC (pc_set (insn));
>-#if 0
>+#if 1
>   /* The previous code here was completely invalid and could never extract
>      the condition from a jump.  This code does the correct thing, but that
>      triggers latent bugs later in the scheduler on ports with conditional
>@@ -1019,7 +1019,8 @@
>     {
>       /* In the case of barrier the most added dependencies are not
>          real, so we use anti-dependence here.  */
>-      if (GET_CODE (PATTERN (insn)) == COND_EXEC)
>+      /* if (GET_CODE (PATTERN (insn)) == COND_EXEC)  */
>+      if (sched_get_condition (insn))
> 	{
> 	  EXECUTE_IF_SET_IN_REG_SET (&deps->reg_last_in_use, 0, i, rsi)
> 	    {
>@@ -1066,7 +1067,8 @@
>     {
>       /* If the current insn is conditional, we can't free any
> 	 of the lists.  */
>-      if (GET_CODE (PATTERN (insn)) == COND_EXEC)
>+      /* if (GET_CODE (PATTERN (insn)) == COND_EXEC)  */
>+      if (sched_get_condition (insn))
> 	{
> 	  EXECUTE_IF_SET_IN_REG_SET (reg_pending_uses, 0, i, rsi)
> 	    {
>  
>


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Scheduler questions (related to PR17808)
  2005-07-05 18:30     ` Vladimir Makarov
@ 2005-07-05 18:57       ` Steven Bosscher
  0 siblings, 0 replies; 19+ messages in thread
From: Steven Bosscher @ 2005-07-05 18:57 UTC (permalink / raw)
  To: Vladimir Makarov; +Cc: Andrey Belevantsev, rearnsha, gcc

On Tuesday 05 July 2005 20:30, Vladimir Makarov wrote:
> Andrey Belevantsev wrote:
> > Vladimir Makarov wrote:
> >> I'll look at this PR today.
> >
> > We've looked today at this issue. We think the problem is that
> > proposed patch of sched_get_condition() treats conditional jumps
> > likely to COND_EXECs, but it doesn't fix other places in sched-deps,
> > where COND_EXECs are considered. Maxim Kuvyrkov proposed the attached
> > patch, which allows gcc to bootstrap on ia64 and fixes the testcase in
> > PR.
>
> This patch is ok to fix PR17808.  You just need to remove conditional
> #if and Richard's comment in sched_get_condition.

Don't do that just yet.

Vlad, you have missed that PR17808 is in fact two separate bugs.  That
is in part my mistake, I should have opened a new PR for the missing
dependencies on ia64.  But I thought it was obvious that we really have
two separate bugs here.

There is no code in sched-rgn to keep the jump at the end in the case
that the COND_EXEC_COND of a cond_exec insn and the jump condition of a
jump_insn are mutually exclusive.  My patch that you just dismissed as
not necessary pretty clearly explained that, and I'm not sure why you
just ignored my explanation.

If you remove that #if 0 from sched_get_condition, you will break all
cond_exec targets except ia64.

Gr.
Steven

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2005-07-05 18:57 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2005-06-29 22:46 Scheduler questions (related to PR17808) Steven Bosscher
2005-06-29 23:30 ` Dale Johannesen
2005-06-30  9:32 ` Richard Earnshaw
2005-06-30 10:22   ` Steven Bosscher
2005-06-30 10:53     ` Richard Earnshaw
2005-06-30 11:57       ` Steven Bosscher
2005-06-30 13:50 ` Vladimir Makarov
2005-06-30 14:19   ` Andrey Belevantsev
2005-06-30 20:29     ` Steven Bosscher
2005-06-30 21:07     ` Steven Bosscher
2005-06-30 22:11       ` James E Wilson
2005-06-30 22:28     ` James E Wilson
2005-06-30 23:06     ` Vladimir Makarov
2005-06-30 23:26       ` Steven Bosscher
2005-06-30 23:29       ` Steven Bosscher
2005-07-05 18:30     ` Vladimir Makarov
2005-07-05 18:57       ` Steven Bosscher
2005-06-30 14:20 ` Mostafa Hagog
2005-06-30 16:47   ` Steven Bosscher

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).