public inbox for gcc-patches@gcc.gnu.org
From: Alan Modra <amodra@gmail.com>
To: gcc-patches@gcc.gnu.org, Bernd Schmidt <bernds@codesourcery.com>
Subject: Re: PowerPC shrink-wrap support 3 of 3
Date: Mon, 31 Oct 2011 15:14:00 -0000
Message-ID: <20111031142722.GX29439@bubble.grove.modra.org>
In-Reply-To: <20111026122719.GM29439@bubble.grove.modra.org>

So I'm at the point where I'm reasonably happy with this work.  This
patch doesn't do anything particularly clever regarding our
shrink-wrap implementation.  We still insert only one copy of the
prologue, and one of the epilogue, in
thread_prologue_and_epilogue_insns.  All it really does is replace
Bernd's !last_bb_active code (which allowed one tail block with no
active insns to be shared by paths needing a prologue and paths not
needing one) with something I think is conceptually simpler:
duplicating a shared tail block.  I then extend this to duplicating a
chain of tail blocks.

That leads to some simplification, as all the special cases and
restrictions of !last_bb_active disappear.  For example,
convert_jumps_to_returns looks much like the code in gcc-4.6.  We
also get many more functions shrink-wrapped.  Some numbers from my
latest gcc bootstraps:

powerpc-linux
.../gcc-virgin/gcc> grep 'Performing shrink' *.pro_and_epilogue | wc -l
453
.../gcc-curr/gcc> grep 'Performing shrink' *.pro_and_epilogue | wc -l
648

i686-linux
.../gcc-virgin/gcc$ grep 'Performing shrink' *pro_and_epilogue | wc -l
329
.../gcc-curr/gcc$ grep 'Performing shrink' *.pro_and_epilogue | wc -l
416

Bits left to do
- limit size of duplicated tails
- don't duplicate sibling call blocks, but instead split the block
  after the sibling call epilogue has been added, redirecting
  non-prologue paths past the epilogue.

Is this OK to apply as is?

	* function.c (bb_active_p): Delete.
	(dup_block_and_redirect, active_insn_between): New functions.
	(convert_jumps_to_returns, emit_return_for_exit): New functions,
	split out from..
	(thread_prologue_and_epilogue_insns): ..here.  Delete
	shadowing variables.  Don't do prologue register clobber tests
	when shrink wrapping already failed.  Delete all last_bb_active
	code.  Instead compute tail block candidates for duplicating
	exit path.  Remove these from antic set.  Duplicate tails when
	reached from both blocks needing a prologue/epilogue and
	blocks not needing such.

Index: gcc/function.c
===================================================================
*** gcc/function.c	(revision 180588)
--- gcc/function.c	(working copy)
*************** set_return_jump_label (rtx returnjump)
*** 5514,5535 ****
      JUMP_LABEL (returnjump) = ret_rtx;
  }
  
! /* Return true if BB has any active insns.  */
  static bool
! bb_active_p (basic_block bb)
  {
    rtx label;
  
!   /* Test whether there are active instructions in BB.  */
!   label = BB_END (bb);
!   while (label && !LABEL_P (label))
      {
!       if (active_insn_p (label))
! 	break;
!       label = PREV_INSN (label);
      }
!   return BB_HEAD (bb) != label || !LABEL_P (label);
  }
  
  /* Generate the prologue and epilogue RTL if the machine supports it.  Thread
     this into place with notes indicating where the prologue ends and where
--- 5514,5698 ----
      JUMP_LABEL (returnjump) = ret_rtx;
  }
  
! #ifdef HAVE_simple_return
! /* Create a copy of BB instructions and insert at BEFORE.  Redirect
!    preds of BB to COPY_BB if they don't appear in NEED_PROLOGUE.  */
! static void
! dup_block_and_redirect (basic_block bb, basic_block copy_bb, rtx before,
! 			bitmap_head *need_prologue)
! {
!   edge_iterator ei;
!   edge e;
!   rtx insn = BB_END (bb);
! 
!   /* We know BB has a single successor, so there is no need to copy a
!      simple jump at the end of BB.  */
!   if (simplejump_p (insn))
!     insn = PREV_INSN (insn);
! 
!   start_sequence ();
!   duplicate_insn_chain (BB_HEAD (bb), insn);
!   if (dump_file)
!     {
!       unsigned count = 0;
!       for (insn = get_insns (); insn; insn = NEXT_INSN (insn))
! 	if (active_insn_p (insn))
! 	  ++count;
!       fprintf (dump_file, "Duplicating bb %d to bb %d, %u active insns.\n",
! 	       bb->index, copy_bb->index, count);
!     }
!   insn = get_insns ();
!   end_sequence ();
!   emit_insn_before (insn, before);
! 
!   /* Redirect all the paths that need no prologue into copy_bb.  */
!   for (ei = ei_start (bb->preds); (e = ei_safe_edge (ei)); )
!     if (!bitmap_bit_p (need_prologue, e->src->index))
!       {
! 	redirect_edge_and_branch_force (e, copy_bb);
! 	continue;
!       }
!     else
!       ei_next (&ei);
! }
! #endif
! 
! #if defined (HAVE_return) || defined (HAVE_simple_return)
! /* Return true if there are any active insns between HEAD and TAIL.  */
  static bool
! active_insn_between (rtx head, rtx tail)
! {
!   while (tail)
!     {
!       if (active_insn_p (tail))
! 	return true;
!       if (tail == head)
! 	return false;
!       tail = PREV_INSN (tail);
!     }
!   return false;
! }
! 
! /* LAST_BB is a block that exits, and empty of active instructions.
!    Examine its predecessors for jumps that can be converted to
!    (conditional) returns.  */
! static VEC (edge, heap) *
! convert_jumps_to_returns (basic_block last_bb, bool simple_p,
! 			  VEC (edge, heap) *unconverted ATTRIBUTE_UNUSED)
  {
+   int i;
+   basic_block bb;
    rtx label;
+   edge_iterator ei;
+   edge e;
+   VEC(basic_block,heap) *src_bbs;
+ 
+   src_bbs = VEC_alloc (basic_block, heap, EDGE_COUNT (last_bb->preds));
+   FOR_EACH_EDGE (e, ei, last_bb->preds)
+     if (e->src != ENTRY_BLOCK_PTR)
+       VEC_quick_push (basic_block, src_bbs, e->src);
  
!   label = BB_HEAD (last_bb);
! 
!   FOR_EACH_VEC_ELT (basic_block, src_bbs, i, bb)
      {
!       rtx jump = BB_END (bb);
! 
!       if (!JUMP_P (jump) || JUMP_LABEL (jump) != label)
! 	continue;
! 
!       e = find_edge (bb, last_bb);
! 
!       /* If we have an unconditional jump, we can replace that
! 	 with a simple return instruction.  */
!       if (simplejump_p (jump))
! 	{
! 	  /* The use of the return register might be present in the exit
! 	     fallthru block.  Either:
! 	     - removing the use is safe, and we should remove the use in
! 	     the exit fallthru block, or
! 	     - removing the use is not safe, and we should add it here.
! 	     For now, we conservatively choose the latter.  Either of the
! 	     2 helps in crossjumping.  */
! 	  emit_use_return_register_into_block (bb);
! 
! 	  emit_return_into_block (simple_p, bb);
! 	  delete_insn (jump);
! 	}
! 
!       /* If we have a conditional jump branching to the last
! 	 block, we can try to replace that with a conditional
! 	 return instruction.  */
!       else if (condjump_p (jump))
! 	{
! 	  rtx dest;
! 
! 	  if (simple_p)
! 	    dest = simple_return_rtx;
! 	  else
! 	    dest = ret_rtx;
! 	  if (!redirect_jump (jump, dest, 0))
! 	    {
! #ifdef HAVE_simple_return
! 	      if (simple_p)
! 		{
! 		  if (dump_file)
! 		    fprintf (dump_file,
! 			     "Failed to redirect bb %d branch.\n", bb->index);
! 		  VEC_safe_push (edge, heap, unconverted, e);
! 		}
! #endif
! 	      continue;
! 	    }
! 
! 	  /* See comment in simplejump_p case above.  */
! 	  emit_use_return_register_into_block (bb);
! 
! 	  /* If this block has only one successor, it both jumps
! 	     and falls through to the fallthru block, so we can't
! 	     delete the edge.  */
! 	  if (single_succ_p (bb))
! 	    continue;
! 	}
!       else
! 	{
! #ifdef HAVE_simple_return
! 	  if (simple_p)
! 	    {
! 	      if (dump_file)
! 		fprintf (dump_file,
! 			 "Failed to redirect bb %d branch.\n", bb->index);
! 	      VEC_safe_push (edge, heap, unconverted, e);
! 	    }
! #endif
! 	  continue;
! 	}
! 
!       /* Fix up the CFG for the successful change we just made.  */
!       redirect_edge_succ (e, EXIT_BLOCK_PTR);
!     }
!   VEC_free (basic_block, heap, src_bbs);
!   return unconverted;
! }
! 
! /* Emit a return insn for the exit fallthru block.  */
! static basic_block
! emit_return_for_exit (edge exit_fallthru_edge, bool simple_p)
! {
!   basic_block last_bb = exit_fallthru_edge->src;
! 
!   if (JUMP_P (BB_END (last_bb)))
!     {
!       last_bb = split_edge (exit_fallthru_edge);
!       exit_fallthru_edge = single_succ_edge (last_bb);
      }
!   emit_barrier_after (BB_END (last_bb));
!   emit_return_into_block (simple_p, last_bb);
!   exit_fallthru_edge->flags &= ~EDGE_FALLTHRU;
!   return last_bb;
  }
+ #endif
+ 
  
  /* Generate the prologue and epilogue RTL if the machine supports it.  Thread
     this into place with notes indicating where the prologue ends and where
*************** static void
*** 5583,5602 ****
  thread_prologue_and_epilogue_insns (void)
  {
    bool inserted;
-   basic_block last_bb;
-   bool last_bb_active ATTRIBUTE_UNUSED;
  #ifdef HAVE_simple_return
!   VEC (rtx, heap) *unconverted_simple_returns = NULL;
!   basic_block simple_return_block_hot = NULL;
!   basic_block simple_return_block_cold = NULL;
    bool nonempty_prologue;
  #endif
!   rtx returnjump ATTRIBUTE_UNUSED;
    rtx seq ATTRIBUTE_UNUSED, epilogue_end ATTRIBUTE_UNUSED;
    rtx prologue_seq ATTRIBUTE_UNUSED, split_prologue_seq ATTRIBUTE_UNUSED;
    edge e, entry_edge, orig_entry_edge, exit_fallthru_edge;
    edge_iterator ei;
-   bitmap_head bb_flags;
  
    df_analyze ();
  
--- 5746,5761 ----
  thread_prologue_and_epilogue_insns (void)
  {
    bool inserted;
  #ifdef HAVE_simple_return
!   VEC (edge, heap) *unconverted_simple_returns = NULL;
    bool nonempty_prologue;
+   bitmap_head bb_flags;
  #endif
!   rtx returnjump;
    rtx seq ATTRIBUTE_UNUSED, epilogue_end ATTRIBUTE_UNUSED;
    rtx prologue_seq ATTRIBUTE_UNUSED, split_prologue_seq ATTRIBUTE_UNUSED;
    edge e, entry_edge, orig_entry_edge, exit_fallthru_edge;
    edge_iterator ei;
  
    df_analyze ();
  
*************** thread_prologue_and_epilogue_insns (void
*** 5614,5631 ****
    entry_edge = single_succ_edge (ENTRY_BLOCK_PTR);
    orig_entry_edge = entry_edge;
  
-   exit_fallthru_edge = find_fallthru_edge (EXIT_BLOCK_PTR->preds);
-   if (exit_fallthru_edge != NULL)
-     {
-       last_bb = exit_fallthru_edge->src;
-       last_bb_active = bb_active_p (last_bb);
-     }
-   else
-     {
-       last_bb = NULL;
-       last_bb_active = false;
-     }
- 
    split_prologue_seq = NULL_RTX;
    if (flag_split_stack
        && (lookup_attribute ("no_split_stack", DECL_ATTRIBUTES (cfun->decl))
--- 5773,5778 ----
*************** thread_prologue_and_epilogue_insns (void
*** 5675,5683 ****
      }
  #endif
  
    bitmap_initialize (&bb_flags, &bitmap_default_obstack);
  
- #ifdef HAVE_simple_return
    /* Try to perform a kind of shrink-wrapping, making sure the
       prologue/epilogue is emitted only around those parts of the
       function that require it.  */
--- 5822,5830 ----
      }
  #endif
  
+ #ifdef HAVE_simple_return
    bitmap_initialize (&bb_flags, &bitmap_default_obstack);
  
    /* Try to perform a kind of shrink-wrapping, making sure the
       prologue/epilogue is emitted only around those parts of the
       function that require it.  */
*************** thread_prologue_and_epilogue_insns (void
*** 5697,5707 ****
        HARD_REG_SET prologue_clobbered, prologue_used, live_on_edge;
        HARD_REG_SET set_up_by_prologue;
        rtx p_insn;
- 
        VEC(basic_block, heap) *vec;
        basic_block bb;
        bitmap_head bb_antic_flags;
        bitmap_head bb_on_list;
  
        if (dump_file)
  	fprintf (dump_file, "Attempting shrink-wrapping optimization.\n");
--- 5844,5854 ----
        HARD_REG_SET prologue_clobbered, prologue_used, live_on_edge;
        HARD_REG_SET set_up_by_prologue;
        rtx p_insn;
        VEC(basic_block, heap) *vec;
        basic_block bb;
        bitmap_head bb_antic_flags;
        bitmap_head bb_on_list;
+       bitmap_head bb_tail;
  
        if (dump_file)
  	fprintf (dump_file, "Attempting shrink-wrapping optimization.\n");
*************** thread_prologue_and_epilogue_insns (void
*** 5726,5737 ****
  
        prepare_shrink_wrap (entry_edge->dest);
  
-       /* That may have inserted instructions into the last block.  */
-       if (last_bb && !last_bb_active)
- 	last_bb_active = bb_active_p (last_bb);
- 
        bitmap_initialize (&bb_antic_flags, &bitmap_default_obstack);
        bitmap_initialize (&bb_on_list, &bitmap_default_obstack);
  
        /* Find the set of basic blocks that require a stack frame.  */
  
--- 5873,5881 ----
  
        prepare_shrink_wrap (entry_edge->dest);
  
        bitmap_initialize (&bb_antic_flags, &bitmap_default_obstack);
        bitmap_initialize (&bb_on_list, &bitmap_default_obstack);
+       bitmap_initialize (&bb_tail, &bitmap_default_obstack);
  
        /* Find the set of basic blocks that require a stack frame.  */
  
*************** thread_prologue_and_epilogue_insns (void
*** 5750,5812 ****
        FOR_EACH_BB (bb)
  	{
  	  rtx insn;
! 	  /* As a special case, check for jumps to the last bb that
! 	     cannot successfully be converted to simple_returns later
! 	     on, and mark them as requiring a frame.  These are
! 	     conditional jumps that jump to their fallthru block, so
! 	     it's not a case that is expected to occur often.  */
! 	  if (JUMP_P (BB_END (bb)) && any_condjump_p (BB_END (bb))
! 	      && single_succ_p (bb)
! 	      && !last_bb_active
! 	      && single_succ (bb) == last_bb)
! 	    {
! 	      bitmap_set_bit (&bb_flags, bb->index);
! 	      VEC_quick_push (basic_block, vec, bb);
! 	    }
! 	  else
! 	    FOR_BB_INSNS (bb, insn)
! 	      if (requires_stack_frame_p (insn, prologue_used,
! 					  set_up_by_prologue))
! 		{
! 		  bitmap_set_bit (&bb_flags, bb->index);
! 		  VEC_quick_push (basic_block, vec, bb);
! 		  break;
! 		}
  	}
  
        /* For every basic block that needs a prologue, mark all blocks
  	 reachable from it, so as to ensure they are also seen as
  	 requiring a prologue.  */
        while (!VEC_empty (basic_block, vec))
  	{
  	  basic_block tmp_bb = VEC_pop (basic_block, vec);
! 	  edge e;
! 	  edge_iterator ei;
  	  FOR_EACH_EDGE (e, ei, tmp_bb->succs)
  	    if (e->dest != EXIT_BLOCK_PTR
  		&& bitmap_set_bit (&bb_flags, e->dest->index))
  	      VEC_quick_push (basic_block, vec, e->dest);
  	}
!       /* If the last basic block contains only a label, we'll be able
! 	 to convert jumps to it to (potentially conditional) return
! 	 insns later.  This means we don't necessarily need a prologue
! 	 for paths reaching it.  */
!       if (last_bb && optimize)
! 	{
! 	  if (!last_bb_active)
! 	    bitmap_clear_bit (&bb_flags, last_bb->index);
! 	  else if (!bitmap_bit_p (&bb_flags, last_bb->index))
! 	    goto fail_shrinkwrap;
  	}
  
        /* Now walk backwards from every block that is marked as needing
! 	 a prologue to compute the bb_antic_flags bitmap.  */
!       bitmap_copy (&bb_antic_flags, &bb_flags);
        FOR_EACH_BB (bb)
  	{
! 	  edge e;
! 	  edge_iterator ei;
! 	  if (!bitmap_bit_p (&bb_flags, bb->index))
  	    continue;
  	  FOR_EACH_EDGE (e, ei, bb->preds)
  	    if (!bitmap_bit_p (&bb_antic_flags, e->src->index)
--- 5894,5947 ----
        FOR_EACH_BB (bb)
  	{
  	  rtx insn;
! 	  FOR_BB_INSNS (bb, insn)
! 	    if (requires_stack_frame_p (insn, prologue_used,
! 					set_up_by_prologue))
! 	      {
! 		bitmap_set_bit (&bb_flags, bb->index);
! 		VEC_quick_push (basic_block, vec, bb);
! 		break;
! 	      }
  	}
  
+       /* Save a copy of blocks that really need a prologue.  */
+       bitmap_copy (&bb_antic_flags, &bb_flags);
+ 
        /* For every basic block that needs a prologue, mark all blocks
  	 reachable from it, so as to ensure they are also seen as
  	 requiring a prologue.  */
        while (!VEC_empty (basic_block, vec))
  	{
  	  basic_block tmp_bb = VEC_pop (basic_block, vec);
! 
  	  FOR_EACH_EDGE (e, ei, tmp_bb->succs)
  	    if (e->dest != EXIT_BLOCK_PTR
  		&& bitmap_set_bit (&bb_flags, e->dest->index))
  	      VEC_quick_push (basic_block, vec, e->dest);
  	}
! 
!       /* Find the set of basic blocks that need no prologue, have a
! 	 single successor, and only go to the exit.  */
!       VEC_quick_push (basic_block, vec, EXIT_BLOCK_PTR);
!       while (!VEC_empty (basic_block, vec))
! 	{
! 	  basic_block tmp_bb = VEC_pop (basic_block, vec);
! 
! 	  FOR_EACH_EDGE (e, ei, tmp_bb->preds)
! 	    if (single_succ_p (e->src)
! 		&& !bitmap_bit_p (&bb_antic_flags, e->src->index)
! 		&& bitmap_set_bit (&bb_tail, e->src->index))
! 	      VEC_quick_push (basic_block, vec, e->src);
  	}
  
        /* Now walk backwards from every block that is marked as needing
! 	 a prologue to compute the bb_antic_flags bitmap.  Exclude
! 	 tail blocks; They can be duplicated to be used on paths not
! 	 needing a prologue.  */
!       bitmap_and_compl (&bb_antic_flags, &bb_flags, &bb_tail);
        FOR_EACH_BB (bb)
  	{
! 	  if (!bitmap_bit_p (&bb_antic_flags, bb->index))
  	    continue;
  	  FOR_EACH_EDGE (e, ei, bb->preds)
  	    if (!bitmap_bit_p (&bb_antic_flags, e->src->index)
*************** thread_prologue_and_epilogue_insns (void
*** 5816,5823 ****
        while (!VEC_empty (basic_block, vec))
  	{
  	  basic_block tmp_bb = VEC_pop (basic_block, vec);
- 	  edge e;
- 	  edge_iterator ei;
  	  bool all_set = true;
  
  	  bitmap_clear_bit (&bb_on_list, tmp_bb->index);
--- 5951,5956 ----
*************** thread_prologue_and_epilogue_insns (void
*** 5862,5889 ****
  		}
  	  }
  
!       /* Test whether the prologue is known to clobber any register
! 	 (other than FP or SP) which are live on the edge.  */
!       CLEAR_HARD_REG_BIT (prologue_clobbered, STACK_POINTER_REGNUM);
!       if (frame_pointer_needed)
! 	CLEAR_HARD_REG_BIT (prologue_clobbered, HARD_FRAME_POINTER_REGNUM);
!       CLEAR_HARD_REG_SET (live_on_edge);
!       reg_set_to_hard_reg_set (&live_on_edge,
! 			       df_get_live_in (entry_edge->dest));
!       if (hard_reg_set_intersect_p (live_on_edge, prologue_clobbered))
  	{
! 	  entry_edge = orig_entry_edge;
! 	  if (dump_file)
! 	    fprintf (dump_file, "Shrink-wrapping aborted due to clobber.\n");
  	}
!       else if (entry_edge != orig_entry_edge)
  	{
  	  crtl->shrink_wrapped = true;
  	  if (dump_file)
  	    fprintf (dump_file, "Performing shrink-wrapping.\n");
  	}
  
      fail_shrinkwrap:
        bitmap_clear (&bb_antic_flags);
        bitmap_clear (&bb_on_list);
        VEC_free (basic_block, heap, vec);
--- 5995,6132 ----
  		}
  	  }
  
!       if (entry_edge != orig_entry_edge)
  	{
! 	  /* Test whether the prologue is known to clobber any register
! 	     (other than FP or SP) which are live on the edge.  */
! 	  CLEAR_HARD_REG_BIT (prologue_clobbered, STACK_POINTER_REGNUM);
! 	  if (frame_pointer_needed)
! 	    CLEAR_HARD_REG_BIT (prologue_clobbered, HARD_FRAME_POINTER_REGNUM);
! 	  CLEAR_HARD_REG_SET (live_on_edge);
! 	  reg_set_to_hard_reg_set (&live_on_edge,
! 				   df_get_live_in (entry_edge->dest));
! 	  if (hard_reg_set_intersect_p (live_on_edge, prologue_clobbered))
! 	    {
! 	      entry_edge = orig_entry_edge;
! 	      if (dump_file)
! 		fprintf (dump_file,
! 			 "Shrink-wrapping aborted due to clobber.\n");
! 	    }
  	}
!       if (entry_edge != orig_entry_edge)
  	{
  	  crtl->shrink_wrapped = true;
  	  if (dump_file)
  	    fprintf (dump_file, "Performing shrink-wrapping.\n");
+ 
+ 	  /* Find tail blocks reachable from both blocks needing a
+ 	     prologue and blocks not needing a prologue.  */
+ 	  if (!bitmap_empty_p (&bb_tail))
+ 	    FOR_EACH_BB (bb)
+ 	      {
+ 		bool some_pro, some_no_pro;
+ 		if (!bitmap_bit_p (&bb_tail, bb->index))
+ 		  continue;
+ 		some_pro = some_no_pro = false;
+ 		FOR_EACH_EDGE (e, ei, bb->preds)
+ 		  {
+ 		    if (bitmap_bit_p (&bb_flags, e->src->index))
+ 		      some_pro = true;
+ 		    else
+ 		      some_no_pro = true;
+ 		  }
+ 		if (some_pro && some_no_pro)
+ 		  VEC_quick_push (basic_block, vec, bb);
+ 		else
+ 		  bitmap_clear_bit (&bb_tail, bb->index);
+ 	      }
+ 	  /* Find the head of each tail.  */
+ 	  while (!VEC_empty (basic_block, vec))
+ 	    {
+ 	      basic_block tbb = VEC_pop (basic_block, vec);
+ 
+ 	      if (!bitmap_bit_p (&bb_tail, tbb->index))
+ 		continue;
+ 
+ 	      while (single_succ_p (tbb))
+ 		{
+ 		  tbb = single_succ (tbb);
+ 		  bitmap_clear_bit (&bb_tail, tbb->index);
+ 		}
+ 	    }
+ 	  /* Now duplicate the tails.  */
+ 	  if (!bitmap_empty_p (&bb_tail))
+ 	    FOR_EACH_BB_REVERSE (bb)
+ 	      {
+ 		basic_block copy_bb, next_bb;
+ 		rtx insert_point;
+ 
+ 		if (!bitmap_clear_bit (&bb_tail, bb->index))
+ 		  continue;
+ 
+ 		next_bb = single_succ (bb);
+ 
+ 		/* Create a copy of BB, instructions and all, for
+ 		   use on paths that don't need a prologue.
+ 		   Ideal placement of the copy is on a fall-thru edge
+ 		   or after a block that would jump to the copy.  */ 
+ 		FOR_EACH_EDGE (e, ei, bb->preds)
+ 		  if (!bitmap_bit_p (&bb_flags, e->src->index)
+ 		      && single_succ_p (e->src))
+ 		    break;
+ 		if (e)
+ 		  {
+ 		    copy_bb = create_basic_block (NEXT_INSN (BB_END (e->src)),
+ 						  NULL_RTX, e->src);
+ 		    BB_COPY_PARTITION (copy_bb, e->src);
+ 		  }
+ 		else
+ 		  {
+ 		    /* Otherwise put the copy at the end of the function.  */
+ 		    copy_bb = create_basic_block (NULL_RTX, NULL_RTX,
+ 						  EXIT_BLOCK_PTR->prev_bb);
+ 		    BB_COPY_PARTITION (copy_bb, bb);
+ 		  }
+ 
+ 		insert_point = emit_note_after (NOTE_INSN_DELETED,
+ 						BB_END (copy_bb));
+ 		emit_barrier_after (BB_END (copy_bb));
+ 
+ 		make_single_succ_edge (copy_bb, EXIT_BLOCK_PTR, 0);
+ 
+ 		dup_block_and_redirect (bb, copy_bb, insert_point,
+ 					&bb_flags);
+ 
+ 		while (next_bb != EXIT_BLOCK_PTR)
+ 		  {
+ 		    basic_block tbb = next_bb;
+ 		    next_bb = single_succ (tbb);
+ 		    e = split_block (copy_bb, PREV_INSN (insert_point));
+ 		    copy_bb = e->dest;
+ 		    dup_block_and_redirect (tbb, copy_bb, insert_point,
+ 					    &bb_flags);
+ 		  }
+ 
+ 		/* Fiddle with edge flags to quiet verify_flow_info.
+ 		   We have yet to add a simple_return to the tails,
+ 		   as we'd like to first convert_jumps_to_returns in
+ 		   case the block is no longer used after that.  */
+ 		e = single_succ_edge (copy_bb);
+ 		if (CALL_P (PREV_INSN (insert_point))
+ 		    && SIBLING_CALL_P (PREV_INSN (insert_point)))
+ 		  e->flags = EDGE_SIBCALL | EDGE_ABNORMAL;
+ 		else
+ 		  e->flags = EDGE_FAKE;
+ 		/* verify_flow_info doesn't like a note after a
+ 		   sibling call.  */
+ 		delete_insn (insert_point);
+ 		if (bitmap_empty_p (&bb_tail))
+ 		  break;
+ 	      }
  	}
  
      fail_shrinkwrap:
+       bitmap_clear (&bb_tail);
        bitmap_clear (&bb_antic_flags);
        bitmap_clear (&bb_on_list);
        VEC_free (basic_block, heap, vec);
*************** thread_prologue_and_epilogue_insns (void
*** 5911,6057 ****
  
    rtl_profile_for_bb (EXIT_BLOCK_PTR);
  
! #ifdef HAVE_return
    /* If we're allowed to generate a simple return instruction, then by
       definition we don't need a full epilogue.  If the last basic
       block before the exit block does not contain active instructions,
       examine its predecessors and try to emit (conditional) return
       instructions.  */
!   if (optimize && !last_bb_active
!       && (HAVE_return || entry_edge != orig_entry_edge))
      {
!       edge_iterator ei2;
!       int i;
!       basic_block bb;
!       rtx label;
!       VEC(basic_block,heap) *src_bbs;
! 
!       if (exit_fallthru_edge == NULL)
! 	goto epilogue_done;
!       label = BB_HEAD (last_bb);
! 
!       src_bbs = VEC_alloc (basic_block, heap, EDGE_COUNT (last_bb->preds));
!       FOR_EACH_EDGE (e, ei2, last_bb->preds)
! 	if (e->src != ENTRY_BLOCK_PTR)
! 	  VEC_quick_push (basic_block, src_bbs, e->src);
! 
!       FOR_EACH_VEC_ELT (basic_block, src_bbs, i, bb)
  	{
! 	  bool simple_p;
! 	  rtx jump;
! 	  e = find_edge (bb, last_bb);
  
! 	  jump = BB_END (bb);
! 
! #ifdef HAVE_simple_return
! 	  simple_p = (entry_edge != orig_entry_edge
! 		      && !bitmap_bit_p (&bb_flags, bb->index));
! #else
! 	  simple_p = false;
! #endif
! 
! 	  if (!simple_p
! 	      && (!HAVE_return || !JUMP_P (jump)
! 		  || JUMP_LABEL (jump) != label))
! 	    continue;
! 
! 	  /* If we have an unconditional jump, we can replace that
! 	     with a simple return instruction.  */
! 	  if (!JUMP_P (jump))
! 	    {
! 	      emit_barrier_after (BB_END (bb));
! 	      emit_return_into_block (simple_p, bb);
! 	    }
! 	  else if (simplejump_p (jump))
  	    {
! 	      /* The use of the return register might be present in the exit
! 	         fallthru block.  Either:
! 	         - removing the use is safe, and we should remove the use in
! 	           the exit fallthru block, or
! 	         - removing the use is not safe, and we should add it here.
! 	         For now, we conservatively choose the latter.  Either of the
! 	         2 helps in crossjumping.  */
! 	      emit_use_return_register_into_block (bb);
! 
! 	      emit_return_into_block (simple_p, bb);
! 	      delete_insn (jump);
  	    }
! 	  else if (condjump_p (jump) && JUMP_LABEL (jump) != label)
! 	    {
! 	      basic_block new_bb;
! 	      edge new_e;
  
! 	      gcc_assert (simple_p);
! 	      new_bb = split_edge (e);
! 	      emit_barrier_after (BB_END (new_bb));
! 	      emit_return_into_block (simple_p, new_bb);
! #ifdef HAVE_simple_return
! 	      if (BB_PARTITION (new_bb) == BB_HOT_PARTITION)
! 		simple_return_block_hot = new_bb;
! 	      else
! 		simple_return_block_cold = new_bb;
  #endif
! 	      new_e = single_succ_edge (new_bb);
! 	      redirect_edge_succ (new_e, EXIT_BLOCK_PTR);
  
! 	      continue;
! 	    }
! 	  /* If we have a conditional jump branching to the last
! 	     block, we can try to replace that with a conditional
! 	     return instruction.  */
! 	  else if (condjump_p (jump))
! 	    {
! 	      rtx dest;
! 	      if (simple_p)
! 		dest = simple_return_rtx;
! 	      else
! 		dest = ret_rtx;
! 	      if (! redirect_jump (jump, dest, 0))
! 		{
! #ifdef HAVE_simple_return
! 		  if (simple_p)
! 		    VEC_safe_push (rtx, heap,
! 				   unconverted_simple_returns, jump);
! #endif
! 		  continue;
! 		}
  
! 	      /* See comment in simple_jump_p case above.  */
! 	      emit_use_return_register_into_block (bb);
  
! 	      /* If this block has only one successor, it both jumps
! 		 and falls through to the fallthru block, so we can't
! 		 delete the edge.  */
! 	      if (single_succ_p (bb))
! 		continue;
! 	    }
! 	  else
  	    {
  #ifdef HAVE_simple_return
! 	      if (simple_p)
! 		VEC_safe_push (rtx, heap,
! 			       unconverted_simple_returns, jump);
  #endif
! 	      continue;
  	    }
- 
- 	  /* Fix up the CFG for the successful change we just made.  */
- 	  redirect_edge_succ (e, EXIT_BLOCK_PTR);
- 	}
-       VEC_free (basic_block, heap, src_bbs);
- 
-       if (HAVE_return)
- 	{
- 	  /* Emit a return insn for the exit fallthru block.  Whether
- 	     this is still reachable will be determined later.  */
- 
- 	  emit_barrier_after (BB_END (last_bb));
- 	  emit_return_into_block (false, last_bb);
- 	  epilogue_end = BB_END (last_bb);
- 	  if (JUMP_P (epilogue_end))
- 	    set_return_jump_label (epilogue_end);
- 	  single_succ_edge (last_bb)->flags &= ~EDGE_FALLTHRU;
- 	  goto epilogue_done;
  	}
      }
  #endif
--- 6154,6226 ----
  
    rtl_profile_for_bb (EXIT_BLOCK_PTR);
  
!   exit_fallthru_edge = find_fallthru_edge (EXIT_BLOCK_PTR->preds);
! 
    /* If we're allowed to generate a simple return instruction, then by
       definition we don't need a full epilogue.  If the last basic
       block before the exit block does not contain active instructions,
       examine its predecessors and try to emit (conditional) return
       instructions.  */
! #ifdef HAVE_simple_return
!   if (entry_edge != orig_entry_edge)
      {
!       if (optimize)
  	{
! 	  unsigned i, last;
  
! 	  /* convert_jumps_to_returns may add to EXIT_BLOCK_PTR->preds
! 	     (but won't remove).  Stop at end of current preds.  */
! 	  last = EDGE_COUNT (EXIT_BLOCK_PTR->preds);
! 	  for (i = 0; i < last; i++)
  	    {
! 	      e = EDGE_I (EXIT_BLOCK_PTR->preds, i);
! 	      if (LABEL_P (BB_HEAD (e->src))
! 		  && !bitmap_bit_p (&bb_flags, e->src->index)
! 		  && !active_insn_between (BB_HEAD (e->src), BB_END (e->src)))
! 		unconverted_simple_returns
! 		  = convert_jumps_to_returns (e->src, true,
! 					      unconverted_simple_returns);
  	    }
! 	}
  
!       if (exit_fallthru_edge != NULL
! 	  && EDGE_COUNT (exit_fallthru_edge->src->preds) != 0
! 	  && !bitmap_bit_p (&bb_flags, exit_fallthru_edge->src->index))
! 	{
! 	  basic_block last_bb;
! 
! 	  last_bb = emit_return_for_exit (exit_fallthru_edge, true);
! 	  returnjump = BB_END (last_bb);
! 	  exit_fallthru_edge = NULL;
! 	}
!     }
  #endif
! #ifdef HAVE_return
!   if (HAVE_return)
!     {
!       if (exit_fallthru_edge == NULL)
! 	goto epilogue_done;
  
!       if (optimize)
! 	{
! 	  basic_block last_bb = exit_fallthru_edge->src;
  
! 	  if (LABEL_P (BB_HEAD (last_bb))
! 	      && !active_insn_between (BB_HEAD (last_bb), BB_END (last_bb)))
! 	    convert_jumps_to_returns (last_bb, false, NULL);
  
! 	  if (EDGE_COUNT (exit_fallthru_edge->src->preds) != 0)
  	    {
+ 	      last_bb = emit_return_for_exit (exit_fallthru_edge, false);
+ 	      epilogue_end = returnjump = BB_END (last_bb);
  #ifdef HAVE_simple_return
! 	      /* Emitting the return may add a basic block.
! 		 Fix bb_flags for the added block.  */
! 	      if (last_bb != exit_fallthru_edge->src)
! 		bitmap_set_bit (&bb_flags, last_bb->index);
  #endif
! 	      goto epilogue_done;
  	    }
  	}
      }
  #endif
*************** epilogue_done:
*** 6171,6180 ****
       convert to conditional simple_returns, but couldn't for some
       reason, create a block to hold a simple_return insn and redirect
       those remaining edges.  */
!   if (!VEC_empty (rtx, unconverted_simple_returns))
      {
        basic_block exit_pred = EXIT_BLOCK_PTR->prev_bb;
-       rtx jump;
        int i;
  
        gcc_assert (entry_edge != orig_entry_edge);
--- 6340,6352 ----
       convert to conditional simple_returns, but couldn't for some
       reason, create a block to hold a simple_return insn and redirect
       those remaining edges.  */
!   if (!VEC_empty (edge, unconverted_simple_returns))
      {
+       basic_block simple_return_block_hot = NULL;
+       basic_block simple_return_block_cold = NULL;
+       edge pending_edge_hot = NULL;
+       edge pending_edge_cold = NULL;
        basic_block exit_pred = EXIT_BLOCK_PTR->prev_bb;
        int i;
  
        gcc_assert (entry_edge != orig_entry_edge);
*************** epilogue_done:
*** 6184,6208 ****
        if (returnjump != NULL_RTX
  	  && JUMP_LABEL (returnjump) == simple_return_rtx)
  	{
! 	  edge e = split_block (exit_fallthru_edge->src,
! 				PREV_INSN (returnjump));
  	  if (BB_PARTITION (e->src) == BB_HOT_PARTITION)
  	    simple_return_block_hot = e->dest;
  	  else
  	    simple_return_block_cold = e->dest;
  	}
  
!       FOR_EACH_VEC_ELT (rtx, unconverted_simple_returns, i, jump)
  	{
- 	  basic_block src_bb = BLOCK_FOR_INSN (jump);
- 	  edge e = find_edge (src_bb, last_bb);
  	  basic_block *pdest_bb;
  
! 	  if (BB_PARTITION (src_bb) == BB_HOT_PARTITION)
! 	    pdest_bb = &simple_return_block_hot;
  	  else
! 	    pdest_bb = &simple_return_block_cold;
! 	  if (*pdest_bb == NULL)
  	    {
  	      basic_block bb;
  	      rtx start;
--- 6356,6403 ----
        if (returnjump != NULL_RTX
  	  && JUMP_LABEL (returnjump) == simple_return_rtx)
  	{
! 	  e = split_block (BLOCK_FOR_INSN (returnjump), PREV_INSN (returnjump));
  	  if (BB_PARTITION (e->src) == BB_HOT_PARTITION)
  	    simple_return_block_hot = e->dest;
  	  else
  	    simple_return_block_cold = e->dest;
  	}
  
!       /* Also check returns we might need to add to tail blocks.  */
!       FOR_EACH_EDGE (e, ei, EXIT_BLOCK_PTR->preds)
! 	if (EDGE_COUNT (e->src->preds) != 0
! 	    && (e->flags & EDGE_FAKE) != 0
! 	    && !bitmap_bit_p (&bb_flags, e->src->index))
! 	  {
! 	    if (BB_PARTITION (e->src) == BB_HOT_PARTITION)
! 	      pending_edge_hot = e;
! 	    else
! 	      pending_edge_cold = e;
! 	  }
! 
!       FOR_EACH_VEC_ELT (edge, unconverted_simple_returns, i, e)
  	{
  	  basic_block *pdest_bb;
+ 	  edge pending;
  
! 	  if (BB_PARTITION (e->src) == BB_HOT_PARTITION)
! 	    {
! 	      pdest_bb = &simple_return_block_hot;
! 	      pending = pending_edge_hot;
! 	    }
  	  else
! 	    {
! 	      pdest_bb = &simple_return_block_cold;
! 	      pending = pending_edge_cold;
! 	    }
! 
! 	  if (*pdest_bb == NULL && pending != NULL)
! 	    {
! 	      emit_return_into_block (true, pending->src);
! 	      pending->flags &= ~(EDGE_FALLTHRU | EDGE_FAKE);
! 	      *pdest_bb = pending->src;
! 	    }
! 	  else if (*pdest_bb == NULL)
  	    {
  	      basic_block bb;
  	      rtx start;
*************** epilogue_done:
*** 6219,6225 ****
  	    }
  	  redirect_edge_and_branch_force (e, *pdest_bb);
  	}
!       VEC_free (rtx, heap, unconverted_simple_returns);
      }
  #endif
  
--- 6414,6432 ----
  	    }
  	  redirect_edge_and_branch_force (e, *pdest_bb);
  	}
!       VEC_free (edge, heap, unconverted_simple_returns);
!     }
! 
!   if (entry_edge != orig_entry_edge)
!     {
!       FOR_EACH_EDGE (e, ei, EXIT_BLOCK_PTR->preds)
! 	if (EDGE_COUNT (e->src->preds) != 0
! 	    && (e->flags & EDGE_FAKE) != 0
! 	    && !bitmap_bit_p (&bb_flags, e->src->index))
! 	  {
! 	    emit_return_into_block (true, e->src);
! 	    e->flags &= ~(EDGE_FALLTHRU | EDGE_FAKE);
! 	  }
      }
  #endif
  
*************** epilogue_done:
*** 6233,6240 ****
  
        if (!CALL_P (insn)
  	  || ! SIBLING_CALL_P (insn)
  	  || (entry_edge != orig_entry_edge
! 	      && !bitmap_bit_p (&bb_flags, bb->index)))
  	{
  	  ei_next (&ei);
  	  continue;
--- 6440,6450 ----
  
        if (!CALL_P (insn)
  	  || ! SIBLING_CALL_P (insn)
+ #ifdef HAVE_simple_return
  	  || (entry_edge != orig_entry_edge
! 	      && !bitmap_bit_p (&bb_flags, bb->index))
! #endif
! 	  )
  	{
  	  ei_next (&ei);
  	  continue;
*************** epilogue_done:
*** 6281,6287 ****
--- 6491,6499 ----
      }
  #endif
  
+ #ifdef HAVE_simple_return
    bitmap_clear (&bb_flags);
+ #endif
  
    /* Threading the prologue and epilogue changes the artificial refs
       in the entry and exit blocks.  */
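
For readers less familiar with the RTL epilogue code, here is a rough
Python sketch (not GCC internals; all names here are invented for
illustration) of the control flow in the hot/cold redirection hunk
above: each unconverted simple_return edge is routed to a per-partition
return block, reusing a pending fake-exit tail block when one exists,
and creating a fresh block otherwise.

```python
# Illustrative model of the FOR_EACH_VEC_ELT loop over
# unconverted_simple_returns.  Partitions stand in for
# BB_HOT_PARTITION / BB_COLD_PARTITION.
HOT, COLD = "hot", "cold"

def route_simple_returns(unconverted, pending_tails):
    """unconverted: list of (src_block, partition) edges still needing a
    simple_return.  pending_tails: dict partition -> existing tail block,
    or None.  Returns a dict mapping each src_block to the return block
    its edge was redirected to."""
    return_block = {HOT: None, COLD: None}   # simple_return_block_{hot,cold}
    redirects = {}
    for src, part in unconverted:
        if return_block[part] is None:
            pending = pending_tails.get(part)
            if pending is not None:
                # Like emit_return_into_block: reuse the pending tail
                # block rather than duplicating it.
                return_block[part] = pending
            else:
                # Otherwise create a fresh block holding a simple_return.
                return_block[part] = "new_return_" + part
        # redirect_edge_and_branch_force (e, *pdest_bb)
        redirects[src] = return_block[part]
    return redirects
```

This mirrors the design choice in the patch: a pending tail block in the
right partition is consumed first, so at most one new return block per
partition is ever created.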

-- 
Alan Modra
Australia Development Lab, IBM
