public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed
* DFA scheduling bug?
@ 2003-03-25 17:16 Dan Towner
  2003-03-25 18:32 ` Vladimir Makarov
  0 siblings, 1 reply; 3+ messages in thread
From: Dan Towner @ 2003-03-25 17:16 UTC (permalink / raw)
  To: gcc

Hi all,

I have encountered a problem with the DFA scheduler, for a 16-bit DSP
port of gcc, in which an invalid schedule is generated.

At the end of a function, the expand_epilogue pattern is used to load
the return address from memory, into a register, and then jump to the
location in that register. It looks like this, immediately before the
DFA scheduling pass (in bbro):

(insn/f 58 57 59 2 (nil) (set (reg:HI 12 R12)
         (mem:HI (plus:HI (reg/f:HI 13 FP)
                 (const_int 0 [0x0])) [0 S2 A16])) 15 {movhi} (nil)
     (nil))

(jump_insn 59 58 60 2 (nil) (parallel [
             (return)
             (use (reg:HI 12 R12))
         ]) 6 {*fn_return} (insn_list 58 (nil))
     (expr_list:REG_DEAD (reg:HI 12 R12)
         (nil)))

Note that the return address is loaded from a stack location into R12,
and then the return instructions uses R12. After the DFA scheduling
pass, this code fragment has been converted into the following:

(insn/f 58 57 34 2 (nil) (set (reg:HI 12 R12)
         (mem:HI (plus:HI (reg/f:HI 13 FP)
                 (const_int 0 [0x0])) [0 S2 A16])) 15 {movhi} (nil)
     (nil))

(insn 34 58 59 2 (nil) (use (reg/i:HI 0 R0 [ <result> ])) -1 (insn_list 
31 (nil))
     (nil))

(jump_insn 59 34 60 2 (nil) (parallel [
             (return)
             (use (reg:HI 12 R12))
         ]) 6 {*fn_return} (insn_list:REG_DEP_ANTI 58 
(insn_list:REG_DEP_ANTI 31 (insn_list:REG_DEP_ANTI 34 (nil))))
     (expr_list:REG_DEAD (reg:HI 12 R12)
         (nil)))

And the instructions have been scheduled as:

;;        0--> 58   R12=[FP+0x0]                       :slot1,nothing
;;        0--> 34   use R0                             :nothing
;;        0--> 59   {return;use R12;}                  :slot2

It appears that the use of the register R12, and the loading from
memory into that register are being treated as operations which can
occur in parallel - hence the antidependency that has appeared between
the instructions. This used to work correctly, but has come about
after upgrading my code base to gcc 3.4 mainline (from gcc 3.2
release).

Any ideas what might be going wrong?

thanks,

dan.

=============================================================================
Daniel Towner
picoChip Designs Ltd., Riverside Buildings, 108, Walcot Street, BATH, 
BA1 5BG
dant@picochip.com
07786 702589


^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: DFA scheduling bug?
  2003-03-25 17:16 DFA scheduling bug? Dan Towner
@ 2003-03-25 18:32 ` Vladimir Makarov
  2003-03-27 17:55   ` Vladimir Makarov
  0 siblings, 1 reply; 3+ messages in thread
From: Vladimir Makarov @ 2003-03-25 18:32 UTC (permalink / raw)
  To: Dan Towner; +Cc: gcc

Dan Towner wrote:
> 
> Hi all,
> 
> I have encountered a problem with the DFA scheduler, for a 16-bit DSP
> port of gcc, in which an invalid schedule is generated.
> 
> At the end of a function, the expand_epilogue pattern is used to load
> the return address from memory, into a register, and then jump to the
> location in that register. It looks like this, immediately before the
> DFA scheduling pass (in bbro):
> 
> (insn/f 58 57 59 2 (nil) (set (reg:HI 12 R12)
>          (mem:HI (plus:HI (reg/f:HI 13 FP)
>                  (const_int 0 [0x0])) [0 S2 A16])) 15 {movhi} (nil)
>      (nil))
> 
> (jump_insn 59 58 60 2 (nil) (parallel [
>              (return)
>              (use (reg:HI 12 R12))
>          ]) 6 {*fn_return} (insn_list 58 (nil))
>      (expr_list:REG_DEAD (reg:HI 12 R12)
>          (nil)))
> 
> Note that the return address is loaded from a stack location into R12,
> and then the return instructions uses R12. After the DFA scheduling
> pass, this code fragment has been converted into the following:
> 
> (insn/f 58 57 34 2 (nil) (set (reg:HI 12 R12)
>          (mem:HI (plus:HI (reg/f:HI 13 FP)
>                  (const_int 0 [0x0])) [0 S2 A16])) 15 {movhi} (nil)
>      (nil))
> 
> (insn 34 58 59 2 (nil) (use (reg/i:HI 0 R0 [ <result> ])) -1 (insn_list
> 31 (nil))
>      (nil))
> 
> (jump_insn 59 34 60 2 (nil) (parallel [
>              (return)
>              (use (reg:HI 12 R12))
>          ]) 6 {*fn_return} (insn_list:REG_DEP_ANTI 58
> (insn_list:REG_DEP_ANTI 31 (insn_list:REG_DEP_ANTI 34 (nil))))
>      (expr_list:REG_DEAD (reg:HI 12 R12)
>          (nil)))
> 
> And the instructions have been scheduled as:
> 
> ;;        0--> 58   R12=[FP+0x0]                       :slot1,nothing
> ;;        0--> 34   use R0                             :nothing
> ;;        0--> 59   {return;use R12;}                  :slot2
> 
> It appears that the use of the register R12, and the loading from
> memory into that register are being treated as operations which can
> occur in parallel - hence the antidependency that has appeared between
> the instructions. This used to work correctly, but has come about
> after upgrading my code base to gcc 3.4 mainline (from gcc 3.2
> release).
> 
> Any ideas what might be going wrong?
> 

It is probably one of my patches.  Could you sent me dump files before
the insn scheduling and after that using -fsched-verbose=5.  I'll look
at that and try to fix it.

Vlad

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: DFA scheduling bug?
  2003-03-25 18:32 ` Vladimir Makarov
@ 2003-03-27 17:55   ` Vladimir Makarov
  0 siblings, 0 replies; 3+ messages in thread
From: Vladimir Makarov @ 2003-03-27 17:55 UTC (permalink / raw)
  To: Dan Towner, gcc

[-- Attachment #1: Type: text/plain, Size: 557 bytes --]

Vladimir Makarov wrote:
> 
> Dan Towner wrote:
> >
> > Hi all,
> >
> > I have encountered a problem with the DFA scheduler, for a 16-bit DSP
> > port of gcc, in which an invalid schedule is generated.
> >
> 
> It is probably one of my patches.  Could you sent me dump files before
> the insn scheduling and after that using -fsched-verbose=5.  I'll look
> at that and try to fix it.

I can not reproduce the problem on other gcc ports.  So could you try
the following patch.  If it solves the problem, I'll submit it to public
gcc.

Thanks in advance,

Vlad

[-- Attachment #2: sched-deps.patch --]
[-- Type: text/plain, Size: 7516 bytes --]

Index: sched-deps.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/sched-deps.c,v
retrieving revision 1.56
diff -c -p -r1.56 sched-deps.c
*** sched-deps.c	11 Mar 2003 09:17:38 -0000	1.56
--- sched-deps.c	27 Mar 2003 14:56:17 -0000
*************** static regset_head reg_pending_uses_head
*** 54,60 ****
  static regset reg_pending_sets;
  static regset reg_pending_clobbers;
  static regset reg_pending_uses;
! static bool reg_pending_barrier;
  
  /* To speed up the test for duplicate dependency links we keep a
     record of dependencies created by add_dependence when the average
--- 54,68 ----
  static regset reg_pending_sets;
  static regset reg_pending_clobbers;
  static regset reg_pending_uses;
! 
! enum reg_pending_barrier_mode
! {
!   NOT_A_BARRIER = 0,
!   MOVE_BARRIER,
!   TRUE_BARRIER
! };
! 
! static enum reg_pending_barrier_mode reg_pending_barrier;
  
  /* To speed up the test for duplicate dependency links we keep a
     record of dependencies created by add_dependence when the average
*************** sched_analyze_2 (deps, x, insn)
*** 748,754 ****
  	   mode.  An insn should not be moved across this even if it only uses
  	   pseudo-regs because it might give an incorrectly rounded result.  */
  	if (code != ASM_OPERANDS || MEM_VOLATILE_P (x))
! 	  reg_pending_barrier = true;
  
  	/* For all ASM_OPERANDS, we must traverse the vector of input operands.
  	   We can not just fall through here since then we would be confused
--- 756,762 ----
  	   mode.  An insn should not be moved across this even if it only uses
  	   pseudo-regs because it might give an incorrectly rounded result.  */
  	if (code != ASM_OPERANDS || MEM_VOLATILE_P (x))
! 	  reg_pending_barrier = TRUE_BARRIER;
  
  	/* For all ASM_OPERANDS, we must traverse the vector of input operands.
  	   We can not just fall through here since then we would be confused
*************** sched_analyze_insn (deps, x, insn, loop_
*** 867,873 ****
  	    sched_analyze_2 (deps, XEXP (link, 0), insn);
  	}
        if (find_reg_note (insn, REG_SETJMP, NULL))
! 	reg_pending_barrier = true;
      }
  
    if (GET_CODE (insn) == JUMP_INSN)
--- 875,881 ----
  	    sched_analyze_2 (deps, XEXP (link, 0), insn);
  	}
        if (find_reg_note (insn, REG_SETJMP, NULL))
! 	reg_pending_barrier = MOVE_BARRIER;
      }
  
    if (GET_CODE (insn) == JUMP_INSN)
*************** sched_analyze_insn (deps, x, insn, loop_
*** 875,881 ****
        rtx next;
        next = next_nonnote_insn (insn);
        if (next && GET_CODE (next) == BARRIER)
! 	reg_pending_barrier = true;
        else
  	{
  	  rtx pending, pending_mem;
--- 883,889 ----
        rtx next;
        next = next_nonnote_insn (insn);
        if (next && GET_CODE (next) == BARRIER)
! 	reg_pending_barrier = TRUE_BARRIER;
        else
  	{
  	  rtx pending, pending_mem;
*************** sched_analyze_insn (deps, x, insn, loop_
*** 940,946 ****
  	      || INTVAL (XEXP (link, 0)) == NOTE_INSN_LOOP_END
  	      || INTVAL (XEXP (link, 0)) == NOTE_INSN_EH_REGION_BEG
  	      || INTVAL (XEXP (link, 0)) == NOTE_INSN_EH_REGION_END)
! 	    reg_pending_barrier = true;
  
  	  link = XEXP (link, 1);
  	}
--- 948,954 ----
  	      || INTVAL (XEXP (link, 0)) == NOTE_INSN_LOOP_END
  	      || INTVAL (XEXP (link, 0)) == NOTE_INSN_EH_REGION_BEG
  	      || INTVAL (XEXP (link, 0)) == NOTE_INSN_EH_REGION_END)
! 	    reg_pending_barrier = MOVE_BARRIER;
  
  	  link = XEXP (link, 1);
  	}
*************** sched_analyze_insn (deps, x, insn, loop_
*** 952,958 ****
       where block boundaries fall.  This is mighty confusing elsewhere.
       Therefore, prevent such an instruction from being moved.  */
    if (can_throw_internal (insn))
!     reg_pending_barrier = true;
  
    /* Add dependencies if a scheduling barrier was found.  */
    if (reg_pending_barrier)
--- 960,966 ----
       where block boundaries fall.  This is mighty confusing elsewhere.
       Therefore, prevent such an instruction from being moved.  */
    if (can_throw_internal (insn))
!     reg_pending_barrier = MOVE_BARRIER;
  
    /* Add dependencies if a scheduling barrier was found.  */
    if (reg_pending_barrier)
*************** sched_analyze_insn (deps, x, insn, loop_
*** 965,972 ****
  	    {
  	      struct deps_reg *reg_last = &deps->reg_last[i];
  	      add_dependence_list (insn, reg_last->uses, REG_DEP_ANTI);
! 	      add_dependence_list (insn, reg_last->sets, REG_DEP_ANTI);
! 	      add_dependence_list (insn, reg_last->clobbers, REG_DEP_ANTI);
  	    });
  	}
        else
--- 973,984 ----
  	    {
  	      struct deps_reg *reg_last = &deps->reg_last[i];
  	      add_dependence_list (insn, reg_last->uses, REG_DEP_ANTI);
! 	      add_dependence_list
! 		(insn, reg_last->sets,
! 		 reg_pending_barrier == TRUE_BARRIER ? 0 : REG_DEP_ANTI);
! 	      add_dependence_list
! 		(insn, reg_last->clobbers,
! 		 reg_pending_barrier == TRUE_BARRIER ? 0 : REG_DEP_ANTI);
  	    });
  	}
        else
*************** sched_analyze_insn (deps, x, insn, loop_
*** 976,985 ****
  	      struct deps_reg *reg_last = &deps->reg_last[i];
  	      add_dependence_list_and_free (insn, &reg_last->uses,
  					    REG_DEP_ANTI);
! 	      add_dependence_list_and_free (insn, &reg_last->sets,
! 					    REG_DEP_ANTI);
! 	      add_dependence_list_and_free (insn, &reg_last->clobbers,
! 					    REG_DEP_ANTI);
  	      reg_last->uses_length = 0;
  	      reg_last->clobbers_length = 0;
  	    });
--- 988,999 ----
  	      struct deps_reg *reg_last = &deps->reg_last[i];
  	      add_dependence_list_and_free (insn, &reg_last->uses,
  					    REG_DEP_ANTI);
! 	      add_dependence_list_and_free
! 		(insn, &reg_last->sets,
! 		 reg_pending_barrier == TRUE_BARRIER ? 0 : REG_DEP_ANTI);
! 	      add_dependence_list_and_free
! 		(insn, &reg_last->clobbers,
! 		 reg_pending_barrier == TRUE_BARRIER ? 0 : REG_DEP_ANTI);
  	      reg_last->uses_length = 0;
  	      reg_last->clobbers_length = 0;
  	    });
*************** sched_analyze_insn (deps, x, insn, loop_
*** 993,999 ****
  	}
  
        flush_pending_lists (deps, insn, true, true);
!       reg_pending_barrier = false;
      }
    else
      {
--- 1007,1013 ----
  	}
  
        flush_pending_lists (deps, insn, true, true);
!       reg_pending_barrier = NOT_A_BARRIER;
      }
    else
      {
*************** sched_analyze (deps, head, tail)
*** 1190,1196 ****
  	    {
  	      /* This is setjmp.  Assume that all registers, not just
  		 hard registers, may be clobbered by this call.  */
! 	      reg_pending_barrier = true;
  	    }
  	  else
  	    {
--- 1204,1210 ----
  	    {
  	      /* This is setjmp.  Assume that all registers, not just
  		 hard registers, may be clobbered by this call.  */
! 	      reg_pending_barrier = MOVE_BARRIER;
  	    }
  	  else
  	    {
*************** init_deps_global ()
*** 1505,1511 ****
    reg_pending_sets = INITIALIZE_REG_SET (reg_pending_sets_head);
    reg_pending_clobbers = INITIALIZE_REG_SET (reg_pending_clobbers_head);
    reg_pending_uses = INITIALIZE_REG_SET (reg_pending_uses_head);
!   reg_pending_barrier = false;
  }
  
  /* Free everything used by the dependency analysis code.  */
--- 1519,1525 ----
    reg_pending_sets = INITIALIZE_REG_SET (reg_pending_sets_head);
    reg_pending_clobbers = INITIALIZE_REG_SET (reg_pending_clobbers_head);
    reg_pending_uses = INITIALIZE_REG_SET (reg_pending_uses_head);
!   reg_pending_barrier = NOT_A_BARRIER;
  }
  
  /* Free everything used by the dependency analysis code.  */

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2003-03-27 15:01 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2003-03-25 17:16 DFA scheduling bug? Dan Towner
2003-03-25 18:32 ` Vladimir Makarov
2003-03-27 17:55   ` Vladimir Makarov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).