public inbox for gcc-patches@gcc.gnu.org
From: Bernd Schmidt <bernds@codesourcery.com>
To: GCC Patches <gcc-patches@gcc.gnu.org>, richard.sandiford@linaro.org
Subject: Re: [PATCH 4/6] Shrink-wrapping
Date: Wed, 24 Aug 2011 19:23:00 -0000
Message-ID: <4E552ACB.8050702@codesourcery.com>
In-Reply-To: <g4vcue8pyu.fsf@linaro.org>

[-- Attachment #1: Type: text/plain, Size: 5261 bytes --]

On 08/03/11 17:38, Richard Sandiford wrote:
> Bernd Schmidt <bernds@codesourcery.com> writes:
>> +@findex simple_return
>> +@item (simple_return)
>> +Like @code{(return)}, but truly represents only a function return, while
>> +@code{(return)} may represent an insn that also performs other functions
>> +of the function epilogue.  Like @code{(return)}, this may also occur in
>> +conditional jumps.
> 
> Sorry, I've forgotten the outcome of the discussion about what happens
> on targets whose return expands to the same code as their simple_return.
> Do the targets still need both "return" and "simple_return" rtxes?

It's important to distinguish between the use of these names as rtxes
that can occur in instruction patterns and their use as standard pattern
names.
When a "return" pattern is generated, it should either fail or expand to
something that performs both the epilogue and the return. A
"simple_return" expands to something that performs only the return.

Most targets allow "return" patterns only if the epilogue is empty. In
that case, "return" and "simple_return" can expand to the same insn; it
does not matter if that insn uses "simple_return" or "return", as they
are equivalent in the absence of an epilogue. It would be slightly nicer
to use "simple_return" in the patterns everywhere except ARM, but ports
don't need to be changed.

> Do they need both md patterns (but potentially using the same rtx
> underneath)?

The "return" standard pattern is needed for the existing optimizations
(inserting returns in-line rather than jumping to the end of the
function). Typically, it always fails if the function needs an epilogue,
except in the ARM case.
For shrink-wrapping to work, a port needs a "simple_return" pattern,
which the compiler can use even if parts of the function need an
epilogue. So yes, they have different purposes.
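
A minimal sketch of the distinction between the two standard patterns,
written as a toy model rather than GCC code. The struct, its fields and
both function names are hypothetical and exist only for illustration;
the ARM-like flag is an assumption standing in for "return can also do
epilogue work".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a target; none of these names exist in GCC.  */
struct target_model
{
  bool function_needs_epilogue;   /* the function has a non-empty epilogue */
  bool return_performs_epilogue;  /* ARM-like: "return" can do epilogue work */
};

/* "return" standard pattern: it either fails or expands to something that
   performs both the epilogue and the return.  */
static bool
return_pattern_available (const struct target_model *t)
{
  return !t->function_needs_epilogue || t->return_performs_epilogue;
}

/* "simple_return" standard pattern: performs only the return, so it is
   usable wherever no epilogue work is pending on the current path, even
   if other parts of the function need an epilogue.  */
static bool
simple_return_pattern_available (bool epilogue_pending_on_path)
{
  return !epilogue_pending_on_path;
}
```

In this model a typical target with a non-empty epilogue rejects
"return" but still accepts "simple_return" on a path that needs no
epilogue, which is the case shrink-wrapping relies on.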

> I ask because the rtl.def comment implies that those targets still
> need both expanders and both rtxes.  If that's so, I think it needs
> to be mentioned here too.  E.g. something like:
> 
>   Like @code{(return)}, but truly represents only a function return, while
>   @code{(return)} may represent an insn that also performs other functions
>   of the function epilogue.  @code{(return)} only occurs on paths that
>   pass through the function prologue, while @code{(simple_return)}
>   only occurs on paths that do not pass through the prologue.

This is not accurate for the rtx code. It is mostly accurate for the
standard pattern name. A simple_return rtx may occur just after an
epilogue, i.e. on a path that has passed through the prologue.

Even for the simple_return pattern, I'm not sure reorg.c couldn't
introduce new expansions in a location after both prologue and epilogue.

>   Like @code{(return)}, @code{(simple_return)} may also occur in
>   conditional jumps.
> 
> You need to document the simple_return pattern in md.texi too.

I was trying to make the documentation describe only the state after
this patch. The thinking was that without shrink-wrapping, nothing
generates this pattern, so documenting it would be misleading.
However, with the mips changes in this version of the patch, reorg.c
does make use of this pattern, so I've added documentation.

>> @@ -3498,6 +3506,8 @@ relax_delay_slots (rtx first)
>>  	continue;
>>  
>>        target_label = JUMP_LABEL (delay_insn);
>> +      if (target_label && ANY_RETURN_P (target_label))
>> +	continue;
>>  
>>        if (!ANY_RETURN_P (target_label))
>>  	{
> 
> This doesn't look like a pure "handle return as well as simple return"
> change.  Is the idea that every following test only makes sense for
> labels, and that things like:
> 
> 	  && prev_active_insn (target_label) == insn
> 
> (to pick just one example) are actively dangerous for returns?

That probably was the idea. Looking at it again, there's one case at the
bottom of the loop which may be safe, but given that there were no code
generation differences with the patch on three targets with
define_delay, I've done:

> If so, I think you should remove the immediately-following.
> "if (!ANY_RETURN_P (target_label))" condition and reindent the body.

this.

> Given what you said about JUMP_LABEL sometimes being null,
> I think we need either (a) to check whether each *_return_label
> is null before comparing it with JUMP_LABEL, or (b) to ensure that
> we're dealing with a jump to a label.  (b) seems neater IMO
> (as a call to jump_to_label_p).

Done.
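
For readers unfamiliar with option (b), here is a self-contained sketch
of what a "jump to a label" check has to rule out. It uses mock types,
not GCC's rtx; the enum, the struct and the *_model function names are
assumptions made for this illustration. The point is that a JUMP_LABEL
can be null or can be a return rtx, and only the remaining case is safe
to compare against a label.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Mock rtx codes; a tiny stand-in for the real rtl.def codes.  */
enum code_model { M_CODE_LABEL, M_JUMP_INSN, M_RETURN, M_SIMPLE_RETURN };

/* Mock insn: only the fields this sketch needs.  */
struct rtx_model
{
  enum code_model code;
  const struct rtx_model *jump_label;  /* may be NULL */
};

/* Models ANY_RETURN_P: true for either kind of return rtx.  */
static bool
any_return_p_model (const struct rtx_model *x)
{
  return x->code == M_RETURN || x->code == M_SIMPLE_RETURN;
}

/* Models a jump_to_label_p-style test: a jump whose JUMP_LABEL is
   present and is not a return rtx.  */
static bool
jump_to_label_p_model (const struct rtx_model *insn)
{
  return insn->code == M_JUMP_INSN
	 && insn->jump_label != NULL
	 && !any_return_p_model (insn->jump_label);
}
```

With this guard in front, a later comparison of JUMP_LABEL against
function_return_label or function_simple_return_label cannot match by
accident when both sides are null.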

> 
>> +#if defined HAVE_return || defined HAVE_simple_return
>> +  if (
>>  #ifdef HAVE_return
>> -  if (HAVE_return && end_of_function_label != 0)
>> +      (HAVE_return && function_return_label != 0)
>> +#else
>> +      0
>> +#endif
>> +#ifdef HAVE_simple_return
>> +      || (HAVE_simple_return && function_simple_return_label != 0)
>> +#endif
>> +      )
>>      make_return_insns (first);
>>  #endif
> 
> Eww.

Restructured.
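
One common way to flatten nested #ifdefs like the quoted ones is to
default the missing macros to 0 and then write an ordinary condition.
The sketch below shows that technique in isolation; it is not
necessarily how the attached patch does it, and the helper name
need_make_return_insns is hypothetical. For the demonstration it assumes
a target that provides only simple_return.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumption for this demo: the target defines only HAVE_simple_return.  */
#define HAVE_simple_return 1

/* Default whichever macro the target did not provide to 0.  */
#ifndef HAVE_return
#define HAVE_return 0
#endif
#ifndef HAVE_simple_return
#define HAVE_simple_return 0
#endif

/* Hypothetical helper: true if return insns should be made, i.e. the
   target has a given kind of return and a label was created for it.  */
static bool
need_make_return_insns (const void *function_return_label,
			const void *function_simple_return_label)
{
  return (HAVE_return && function_return_label != NULL)
	 || (HAVE_simple_return && function_simple_return_label != NULL);
}
```

The compiler folds the constant-zero arm away, so the result is the same
as with the preprocessor version while the condition stays readable.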

> (define_code_iterator any_return [return simple_return])
> 
> and just change the appropriate returns to any_returns.

I've done this a bit differently - to show that it can be done, I've
changed mips to always emit simple_return rtxs, even for "return"
patterns (no code generation changes observed again).

This version regression tested on mips64-elf, c/c++/objc.


Bernd

[-- Attachment #2: sw-part2.diff --]
[-- Type: text/plain, Size: 53015 bytes --]

	* doc/rtl.texi (simple_return): Document.
	(parallel, PATTERN): Here too.
	* doc/md.texi (return): Mention it's allowed to expand to simple_return
	in some cases.
	(simple_return): Document standard pattern.
	* gengenrtl.c (special_rtx): SIMPLE_RETURN is special.
	* final.c (final_scan_insn): Use ANY_RETURN_P on body.
	* reorg.c (function_return_label, function_simple_return_label):
	New static variables, replacing...
	(end_of_function_label): ... this.
	(simplejump_or_return_p): New static function.
	(optimize_skip, steal_delay_list_from_fallthrough,
	fill_slots_from_thread): Use it.
	(relax_delay_slots): Likewise.  Use ANY_RETURN_P on body.
	(rare_destination, follow_jumps): Use ANY_RETURN_P on body.
	(find_end_label): Take a new arg which is one of the two return
	rtxs.  Depending on which, set either function_return_label or
	function_simple_return_label.  All callers changed.
	(make_return_insns): Make both kinds.
	(dbr_schedule): Adjust for two kinds of end labels.
	* genemit.c (gen_exp): Handle SIMPLE_RETURN.
	(gen_expand, gen_split): Use ANY_RETURN_P.
	* df-scan.c (df_uses_record): Handle SIMPLE_RETURN.
	* rtl.def (SIMPLE_RETURN): New code.
	* ifcvt.c (find_if_case_1): Be more careful about
	redirecting jumps to the EXIT_BLOCK.
	* jump.c (condjump_p, condjump_in_parallel_p, any_condjump_p,
	returnjump_p_1): Handle SIMPLE_RETURNs.
	* print-rtl.c (print_rtx): Likewise.
	* rtl.c (copy_rtx): Likewise.
	* bt-load.c (compute_defs_uses_and_gen): Use ANY_RETURN_P.
	* combine.c (simplify_set): Likewise.
	* resource.c (find_dead_or_set_registers, mark_set_resources):
	Likewise.
	* emit-rtl.c (verify_rtx_sharing, classify_insn, copy_insn_1,
	copy_rtx_if_shared_1, mark_used_flags): Handle SIMPLE_RETURNs.
	(init_emit_regs): Initialize simple_return_rtx.
	* cfglayout.c (fixup_reorder_chain): Pass a JUMP_LABEL to
	force_nonfallthru_and_redirect.
	* rtl.h (ANY_RETURN_P): Allow SIMPLE_RETURN.
	(GR_SIMPLE_RETURN): New enum value.
	(simple_return_rtx): New macro.
	* basic-block.h (force_nonfallthru_and_redirect): Adjust
	declaration.
	* cfgrtl.c (force_nonfallthru_and_redirect): Take a new jump_label
	argument.  All callers changed.  Be careful about what kinds of
	returnjumps to generate.
	* config/i386/i386.c (ix86_pad_returns, ix86_count_insn_bb,
	ix86_pad_short_function): Likewise.
	* config/arm/arm.c (arm_final_prescan_insn): Handle both kinds
	of return.
	* config/mips/mips.md (any_return): New code_iterator.
	(optab): Add cases for return and simple_return.
	(return): Expand to a simple_return.
	(simple_return): New pattern.
	(*<optab>, *<optab>_internal for any_return): New patterns.
	(return_internal): Remove.
	* config/mips/mips.c (mips_expand_epilogue): Make the last insn
	a simple_return_internal.

Index: gcc/doc/rtl.texi
===================================================================
--- gcc/doc/rtl.texi	(revision 177999)
+++ gcc/doc/rtl.texi	(working copy)
@@ -2915,6 +2915,13 @@ placed in @code{pc} to return to the cal
 Note that an insn pattern of @code{(return)} is logically equivalent to
 @code{(set (pc) (return))}, but the latter form is never used.
 
+@findex simple_return
+@item (simple_return)
+Like @code{(return)}, but truly represents only a function return, while
+@code{(return)} may represent an insn that also performs other functions
+of the function epilogue.  Like @code{(return)}, this may also occur in
+conditional jumps.
+
 @findex call
 @item (call @var{function} @var{nargs})
 Represents a function call.  @var{function} is a @code{mem} expression
@@ -3044,7 +3051,7 @@ Represents several side effects performe
 brackets stand for a vector; the operand of @code{parallel} is a
 vector of expressions.  @var{x0}, @var{x1} and so on are individual
 side effect expressions---expressions of code @code{set}, @code{call},
-@code{return}, @code{clobber} or @code{use}.
+@code{return}, @code{simple_return}, @code{clobber} or @code{use}.
 
 ``In parallel'' means that first all the values used in the individual
 side-effects are computed, and second all the actual side-effects are
@@ -3683,14 +3690,16 @@ and @code{call_insn} insns:
 @table @code
 @findex PATTERN
 @item PATTERN (@var{i})
-An expression for the side effect performed by this insn.  This must be
-one of the following codes: @code{set}, @code{call}, @code{use},
-@code{clobber}, @code{return}, @code{asm_input}, @code{asm_output},
-@code{addr_vec}, @code{addr_diff_vec}, @code{trap_if}, @code{unspec},
-@code{unspec_volatile}, @code{parallel}, @code{cond_exec}, or @code{sequence}.  If it is a @code{parallel},
-each element of the @code{parallel} must be one these codes, except that
-@code{parallel} expressions cannot be nested and @code{addr_vec} and
-@code{addr_diff_vec} are not permitted inside a @code{parallel} expression.
+An expression for the side effect performed by this insn.  This must
+be one of the following codes: @code{set}, @code{call}, @code{use},
+@code{clobber}, @code{return}, @code{simple_return}, @code{asm_input},
+@code{asm_output}, @code{addr_vec}, @code{addr_diff_vec},
+@code{trap_if}, @code{unspec}, @code{unspec_volatile},
+@code{parallel}, @code{cond_exec}, or @code{sequence}.  If it is a
+@code{parallel}, each element of the @code{parallel} must be one these
+codes, except that @code{parallel} expressions cannot be nested and
+@code{addr_vec} and @code{addr_diff_vec} are not permitted inside a
+@code{parallel} expression.
 
 @findex INSN_CODE
 @item INSN_CODE (@var{i})
Index: gcc/gengenrtl.c
===================================================================
--- gcc/gengenrtl.c	(revision 177999)
+++ gcc/gengenrtl.c	(working copy)
@@ -131,6 +131,7 @@ special_rtx (int idx)
 	  || strcmp (defs[idx].enumname, "PC") == 0
 	  || strcmp (defs[idx].enumname, "CC0") == 0
 	  || strcmp (defs[idx].enumname, "RETURN") == 0
+	  || strcmp (defs[idx].enumname, "SIMPLE_RETURN") == 0
 	  || strcmp (defs[idx].enumname, "CONST_VECTOR") == 0);
 }
 
Index: gcc/final.c
===================================================================
--- gcc/final.c	(revision 177999)
+++ gcc/final.c	(working copy)
@@ -2492,7 +2492,7 @@ final_scan_insn (rtx insn, FILE *file, i
 	        delete_insn (insn);
 		break;
 	      }
-	    else if (GET_CODE (SET_SRC (body)) == RETURN)
+	    else if (ANY_RETURN_P (SET_SRC (body)))
 	      /* Replace (set (pc) (return)) with (return).  */
 	      PATTERN (insn) = body = SET_SRC (body);
 
Index: gcc/reorg.c
===================================================================
--- gcc/reorg.c	(revision 177999)
+++ gcc/reorg.c	(working copy)
@@ -161,8 +161,11 @@ static rtx *unfilled_firstobj;
 #define unfilled_slots_next	\
   ((rtx *) obstack_next_free (&unfilled_slots_obstack))
 
-/* Points to the label before the end of the function.  */
-static rtx end_of_function_label;
+/* Points to the label before the end of the function, or before a
+   return insn.  */
+static rtx function_return_label;
+/* Likewise for a simple_return.  */
+static rtx function_simple_return_label;
 
 /* Mapping between INSN_UID's and position in the code since INSN_UID's do
    not always monotonically increase.  */
@@ -175,7 +178,7 @@ static int stop_search_p (rtx, int);
 static int resource_conflicts_p (struct resources *, struct resources *);
 static int insn_references_resource_p (rtx, struct resources *, bool);
 static int insn_sets_resource_p (rtx, struct resources *, bool);
-static rtx find_end_label (void);
+static rtx find_end_label (rtx);
 static rtx emit_delay_sequence (rtx, rtx, int);
 static rtx add_to_delay_list (rtx, rtx);
 static rtx delete_from_delay_slot (rtx);
@@ -231,6 +234,15 @@ first_active_target_insn (rtx insn)
   return next_active_insn (insn);
 }
 \f
+/* Return true iff INSN is a simplejump, or any kind of return insn.  */
+
+static bool
+simplejump_or_return_p (rtx insn)
+{
+  return (JUMP_P (insn)
+	  && (simplejump_p (insn) || ANY_RETURN_P (PATTERN (insn))));
+}
+\f
 /* Return TRUE if this insn should stop the search for insn to fill delay
    slots.  LABELS_P indicates that labels should terminate the search.
    In all cases, jumps terminate the search.  */
@@ -346,23 +358,34 @@ insn_sets_resource_p (rtx insn, struct r
 
    ??? There may be a problem with the current implementation.  Suppose
    we start with a bare RETURN insn and call find_end_label.  It may set
-   end_of_function_label just before the RETURN.  Suppose the machinery
+   function_return_label just before the RETURN.  Suppose the machinery
    is able to fill the delay slot of the RETURN insn afterwards.  Then
-   end_of_function_label is no longer valid according to the property
+   function_return_label is no longer valid according to the property
    described above and find_end_label will still return it unmodified.
    Note that this is probably mitigated by the following observation:
-   once end_of_function_label is made, it is very likely the target of
+   once function_return_label is made, it is very likely the target of
    a jump, so filling the delay slot of the RETURN will be much more
-   difficult.  */
+   difficult.
+   KIND is either simple_return_rtx or ret_rtx, indicating which type of
+   return we're looking for.  */
 
 static rtx
-find_end_label (void)
+find_end_label (rtx kind)
 {
   rtx insn;
+  rtx *plabel;
+
+  if (kind == ret_rtx)
+    plabel = &function_return_label;
+  else
+    {
+      gcc_assert (kind == simple_return_rtx);
+      plabel = &function_simple_return_label;
+    }
 
   /* If we found one previously, return it.  */
-  if (end_of_function_label)
-    return end_of_function_label;
+  if (*plabel)
+    return *plabel;
 
   /* Otherwise, see if there is a label at the end of the function.  If there
      is, it must be that RETURN insns aren't needed, so that is our return
@@ -377,44 +400,45 @@ find_end_label (void)
 
   /* When a target threads its epilogue we might already have a
      suitable return insn.  If so put a label before it for the
-     end_of_function_label.  */
+     function_return_label.  */
   if (BARRIER_P (insn)
       && JUMP_P (PREV_INSN (insn))
-      && GET_CODE (PATTERN (PREV_INSN (insn))) == RETURN)
+      && PATTERN (PREV_INSN (insn)) == kind)
     {
       rtx temp = PREV_INSN (PREV_INSN (insn));
-      end_of_function_label = gen_label_rtx ();
-      LABEL_NUSES (end_of_function_label) = 0;
+      rtx label = gen_label_rtx ();
+      LABEL_NUSES (label) = 0;
 
-      /* Put the label before an USE insns that may precede the RETURN insn.  */
+      /* Put the label before any USE insns that may precede the RETURN
+	 insn.  */
       while (GET_CODE (temp) == USE)
 	temp = PREV_INSN (temp);
 
-      emit_label_after (end_of_function_label, temp);
+      emit_label_after (label, temp);
+      *plabel = label;
     }
 
   else if (LABEL_P (insn))
-    end_of_function_label = insn;
+    *plabel = insn;
   else
     {
-      end_of_function_label = gen_label_rtx ();
-      LABEL_NUSES (end_of_function_label) = 0;
+      rtx label = gen_label_rtx ();
+      LABEL_NUSES (label) = 0;
       /* If the basic block reorder pass moves the return insn to
 	 some other place try to locate it again and put our
-	 end_of_function_label there.  */
-      while (insn && ! (JUMP_P (insn)
-		        && (GET_CODE (PATTERN (insn)) == RETURN)))
+	 function_return_label there.  */
+      while (insn && ! (JUMP_P (insn) && (PATTERN (insn) == kind)))
 	insn = PREV_INSN (insn);
       if (insn)
 	{
 	  insn = PREV_INSN (insn);
 
-	  /* Put the label before an USE insns that may proceed the
+	  /* Put the label before any USE insns that may precede the
 	     RETURN insn.  */
 	  while (GET_CODE (insn) == USE)
 	    insn = PREV_INSN (insn);
 
-	  emit_label_after (end_of_function_label, insn);
+	  emit_label_after (label, insn);
 	}
       else
 	{
@@ -424,19 +448,16 @@ find_end_label (void)
 	      && ! HAVE_return
 #endif
 	      )
-	    {
-	      /* The RETURN insn has its delay slot filled so we cannot
-		 emit the label just before it.  Since we already have
-		 an epilogue and cannot emit a new RETURN, we cannot
-		 emit the label at all.  */
-	      end_of_function_label = NULL_RTX;
-	      return end_of_function_label;
-	    }
+	    /* The RETURN insn has its delay slot filled so we cannot
+	       emit the label just before it.  Since we already have
+	       an epilogue and cannot emit a new RETURN, we cannot
+	       emit the label at all.  */
+	    return NULL_RTX;
 #endif /* HAVE_epilogue */
 
 	  /* Otherwise, make a new label and emit a RETURN and BARRIER,
 	     if needed.  */
-	  emit_label (end_of_function_label);
+	  emit_label (label);
 #ifdef HAVE_return
 	  /* We don't bother trying to create a return insn if the
 	     epilogue has filled delay-slots; we would have to try and
@@ -455,13 +476,14 @@ find_end_label (void)
 	    }
 #endif
 	}
+      *plabel = label;
     }
 
   /* Show one additional use for this label so it won't go away until
      we are done.  */
-  ++LABEL_NUSES (end_of_function_label);
+  ++LABEL_NUSES (*plabel);
 
-  return end_of_function_label;
+  return *plabel;
 }
 \f
 /* Put INSN and LIST together in a SEQUENCE rtx of LENGTH, and replace
@@ -809,10 +831,8 @@ optimize_skip (rtx insn)
   if ((next_trial == next_active_insn (JUMP_LABEL (insn))
        && ! (next_trial == 0 && crtl->epilogue_delay_list != 0))
       || (next_trial != 0
-	  && JUMP_P (next_trial)
-	  && JUMP_LABEL (insn) == JUMP_LABEL (next_trial)
-	  && (simplejump_p (next_trial)
-	      || GET_CODE (PATTERN (next_trial)) == RETURN)))
+	  && simplejump_or_return_p (next_trial)
+	  && JUMP_LABEL (insn) == JUMP_LABEL (next_trial)))
     {
       if (eligible_for_annul_false (insn, 0, trial, flags))
 	{
@@ -831,13 +851,11 @@ optimize_skip (rtx insn)
 	 branch, thread our jump to the target of that branch.  Don't
 	 change this into a RETURN here, because it may not accept what
 	 we have in the delay slot.  We'll fix this up later.  */
-      if (next_trial && JUMP_P (next_trial)
-	  && (simplejump_p (next_trial)
-	      || GET_CODE (PATTERN (next_trial)) == RETURN))
+      if (next_trial && simplejump_or_return_p (next_trial))
 	{
 	  rtx target_label = JUMP_LABEL (next_trial);
 	  if (ANY_RETURN_P (target_label))
-	    target_label = find_end_label ();
+	    target_label = find_end_label (target_label);
 
 	  if (target_label)
 	    {
@@ -951,7 +969,7 @@ rare_destination (rtx insn)
 	     return.  */
 	  return 2;
 	case JUMP_INSN:
-	  if (GET_CODE (PATTERN (insn)) == RETURN)
+	  if (ANY_RETURN_P (PATTERN (insn)))
 	    return 1;
 	  else if (simplejump_p (insn)
 		   && jump_count++ < 10)
@@ -1368,8 +1386,7 @@ steal_delay_list_from_fallthrough (rtx i
   /* We can't do anything if SEQ's delay insn isn't an
      unconditional branch.  */
 
-  if (! simplejump_p (XVECEXP (seq, 0, 0))
-      && GET_CODE (PATTERN (XVECEXP (seq, 0, 0))) != RETURN)
+  if (! simplejump_or_return_p (XVECEXP (seq, 0, 0)))
     return delay_list;
 
   for (i = 1; i < XVECLEN (seq, 0); i++)
@@ -2383,7 +2400,7 @@ fill_simple_delay_slots (int non_jumps_p
 	      if (new_label != 0)
 		new_label = get_label_before (new_label);
 	      else
-		new_label = find_end_label ();
+		new_label = find_end_label (simple_return_rtx);
 
 	      if (new_label)
 	        {
@@ -2515,7 +2532,8 @@ fill_simple_delay_slots (int non_jumps_p
 \f
 /* Follow any unconditional jump at LABEL;
    return the ultimate label reached by any such chain of jumps.
-   Return ret_rtx if the chain ultimately leads to a return instruction.
+   Return a suitable return rtx if the chain ultimately leads to a
+   return instruction.
    If LABEL is not followed by a jump, return LABEL.
    If the chain loops or we can't find end, return LABEL,
    since that tells caller to avoid changing the insn.  */
@@ -2536,7 +2554,7 @@ follow_jumps (rtx label)
 	&& JUMP_P (insn)
 	&& JUMP_LABEL (insn) != NULL_RTX
 	&& ((any_uncondjump_p (insn) && onlyjump_p (insn))
-	    || GET_CODE (PATTERN (insn)) == RETURN)
+	    || ANY_RETURN_P (PATTERN (insn)))
 	&& (next = NEXT_INSN (insn))
 	&& BARRIER_P (next));
        depth++)
@@ -3003,16 +3021,14 @@ fill_slots_from_thread (rtx insn, rtx co
 
       gcc_assert (thread_if_true);
 
-      if (new_thread && JUMP_P (new_thread)
-	  && (simplejump_p (new_thread)
-	      || GET_CODE (PATTERN (new_thread)) == RETURN)
+      if (new_thread && simplejump_or_return_p (new_thread)
 	  && redirect_with_delay_list_safe_p (insn,
 					      JUMP_LABEL (new_thread),
 					      delay_list))
 	new_thread = follow_jumps (JUMP_LABEL (new_thread));
 
       if (ANY_RETURN_P (new_thread))
-	label = find_end_label ();
+	label = find_end_label (new_thread);
       else if (LABEL_P (new_thread))
 	label = new_thread;
       else
@@ -3362,7 +3378,7 @@ relax_delay_slots (rtx first)
 	{
 	  target_label = skip_consecutive_labels (follow_jumps (target_label));
 	  if (ANY_RETURN_P (target_label))
-	    target_label = find_end_label ();
+	    target_label = find_end_label (target_label);
 
 	  if (target_label && next_active_insn (target_label) == next
 	      && ! condjump_in_parallel_p (insn))
@@ -3377,9 +3393,8 @@ relax_delay_slots (rtx first)
 	  /* See if this jump conditionally branches around an unconditional
 	     jump.  If so, invert this jump and point it to the target of the
 	     second jump.  */
-	  if (next && JUMP_P (next)
+	  if (next && simplejump_or_return_p (next)
 	      && any_condjump_p (insn)
-	      && (simplejump_p (next) || GET_CODE (PATTERN (next)) == RETURN)
 	      && target_label
 	      && next_active_insn (target_label) == next_active_insn (next)
 	      && no_labels_between_p (insn, next))
@@ -3421,8 +3436,7 @@ relax_delay_slots (rtx first)
 	 Don't do this if we expect the conditional branch to be true, because
 	 we would then be making the more common case longer.  */
 
-      if (JUMP_P (insn)
-	  && (simplejump_p (insn) || GET_CODE (PATTERN (insn)) == RETURN)
+      if (simplejump_or_return_p (insn)
 	  && (other = prev_active_insn (insn)) != 0
 	  && any_condjump_p (other)
 	  && no_labels_between_p (other, insn)
@@ -3463,10 +3477,10 @@ relax_delay_slots (rtx first)
 	 Only do so if optimizing for size since this results in slower, but
 	 smaller code.  */
       if (optimize_function_for_size_p (cfun)
-	  && GET_CODE (PATTERN (delay_insn)) == RETURN
+	  && ANY_RETURN_P (PATTERN (delay_insn))
 	  && next
 	  && JUMP_P (next)
-	  && GET_CODE (PATTERN (next)) == RETURN)
+	  && PATTERN (next) == PATTERN (delay_insn))
 	{
 	  rtx after;
 	  int i;
@@ -3505,73 +3519,71 @@ relax_delay_slots (rtx first)
 	continue;
 
       target_label = JUMP_LABEL (delay_insn);
+      if (target_label && ANY_RETURN_P (target_label))
+	continue;
 
-      if (!ANY_RETURN_P (target_label))
+      /* If this jump goes to another unconditional jump, thread it, but
+	 don't convert a jump into a RETURN here.  */
+      trial = skip_consecutive_labels (follow_jumps (target_label));
+      if (ANY_RETURN_P (trial))
+	trial = find_end_label (trial);
+
+      if (trial && trial != target_label
+	  && redirect_with_delay_slots_safe_p (delay_insn, trial, insn))
 	{
-	  /* If this jump goes to another unconditional jump, thread it, but
-	     don't convert a jump into a RETURN here.  */
-	  trial = skip_consecutive_labels (follow_jumps (target_label));
-	  if (ANY_RETURN_P (trial))
-	    trial = find_end_label ();
+	  reorg_redirect_jump (delay_insn, trial);
+	  target_label = trial;
+	}
 
-	  if (trial && trial != target_label
-	      && redirect_with_delay_slots_safe_p (delay_insn, trial, insn))
-	    {
-	      reorg_redirect_jump (delay_insn, trial);
-	      target_label = trial;
-	    }
+      /* If the first insn at TARGET_LABEL is redundant with a previous
+	 insn, redirect the jump to the following insn and process again.
+	 We use next_real_insn instead of next_active_insn so we
+	 don't skip USE-markers, or we'll end up with incorrect
+	 liveness info.  */
+      trial = next_real_insn (target_label);
+      if (trial && GET_CODE (PATTERN (trial)) != SEQUENCE
+	  && redundant_insn (trial, insn, 0)
+	  && ! can_throw_internal (trial))
+	{
+	  /* Figure out where to emit the special USE insn so we don't
+	     later incorrectly compute register live/death info.  */
+	  rtx tmp = next_active_insn (trial);
+	  if (tmp == 0)
+	    tmp = find_end_label (simple_return_rtx);
 
-	  /* If the first insn at TARGET_LABEL is redundant with a previous
-	     insn, redirect the jump to the following insn and process again.
-	     We use next_real_insn instead of next_active_insn so we
-	     don't skip USE-markers, or we'll end up with incorrect
-	     liveness info.  */
-	  trial = next_real_insn (target_label);
-	  if (trial && GET_CODE (PATTERN (trial)) != SEQUENCE
-	      && redundant_insn (trial, insn, 0)
-	      && ! can_throw_internal (trial))
+	  if (tmp)
 	    {
-	      /* Figure out where to emit the special USE insn so we don't
-		 later incorrectly compute register live/death info.  */
-	      rtx tmp = next_active_insn (trial);
-	      if (tmp == 0)
-		tmp = find_end_label ();
-
-	      if (tmp)
-	        {
-		  /* Insert the special USE insn and update dataflow info.  */
-		  update_block (trial, tmp);
-
-		  /* Now emit a label before the special USE insn, and
-		     redirect our jump to the new label.  */
-		  target_label = get_label_before (PREV_INSN (tmp));
-		  reorg_redirect_jump (delay_insn, target_label);
-		  next = insn;
-		  continue;
-		}
+	      /* Insert the special USE insn and update dataflow info.  */
+	      update_block (trial, tmp);
+	      
+	      /* Now emit a label before the special USE insn, and
+		 redirect our jump to the new label.  */
+	      target_label = get_label_before (PREV_INSN (tmp));
+	      reorg_redirect_jump (delay_insn, target_label);
+	      next = insn;
+	      continue;
 	    }
+	}
 
-	  /* Similarly, if it is an unconditional jump with one insn in its
-	     delay list and that insn is redundant, thread the jump.  */
-	  if (trial && GET_CODE (PATTERN (trial)) == SEQUENCE
-	      && XVECLEN (PATTERN (trial), 0) == 2
-	      && JUMP_P (XVECEXP (PATTERN (trial), 0, 0))
-	      && (simplejump_p (XVECEXP (PATTERN (trial), 0, 0))
-		  || GET_CODE (PATTERN (XVECEXP (PATTERN (trial), 0, 0))) == RETURN)
-	      && redundant_insn (XVECEXP (PATTERN (trial), 0, 1), insn, 0))
+      /* Similarly, if it is an unconditional jump with one insn in its
+	 delay list and that insn is redundant, thread the jump.  */
+      if (trial && GET_CODE (PATTERN (trial)) == SEQUENCE
+	  && XVECLEN (PATTERN (trial), 0) == 2
+	  && JUMP_P (XVECEXP (PATTERN (trial), 0, 0))
+	  && simplejump_or_return_p (XVECEXP (PATTERN (trial), 0, 0))
+	  && redundant_insn (XVECEXP (PATTERN (trial), 0, 1), insn, 0))
+	{
+	  target_label = JUMP_LABEL (XVECEXP (PATTERN (trial), 0, 0));
+	  if (ANY_RETURN_P (target_label))
+	    target_label = find_end_label (target_label);
+	  
+	  if (target_label
+	      && redirect_with_delay_slots_safe_p (delay_insn, target_label,
+						   insn))
 	    {
-	      target_label = JUMP_LABEL (XVECEXP (PATTERN (trial), 0, 0));
-	      if (ANY_RETURN_P (target_label))
-		target_label = find_end_label ();
-
-	      if (target_label
-	          && redirect_with_delay_slots_safe_p (delay_insn, target_label,
-						       insn))
-		{
-		  reorg_redirect_jump (delay_insn, target_label);
-		  next = insn;
-		  continue;
-		}
+	      reorg_redirect_jump (delay_insn, target_label);
+	      next = insn;
+	      continue;
 	    }
 	}
 
@@ -3640,8 +3652,7 @@ relax_delay_slots (rtx first)
 	 a RETURN here.  */
       if (! INSN_ANNULLED_BRANCH_P (delay_insn)
 	  && any_condjump_p (delay_insn)
-	  && next && JUMP_P (next)
-	  && (simplejump_p (next) || GET_CODE (PATTERN (next)) == RETURN)
+	  && next && simplejump_or_return_p (next)
 	  && next_active_insn (target_label) == next_active_insn (next)
 	  && no_labels_between_p (insn, next))
 	{
@@ -3649,7 +3660,7 @@ relax_delay_slots (rtx first)
 	  rtx old_label = JUMP_LABEL (delay_insn);
 
 	  if (ANY_RETURN_P (label))
-	    label = find_end_label ();
+	    label = find_end_label (label);
 
 	  /* find_end_label can generate a new label. Check this first.  */
 	  if (label
@@ -3710,7 +3721,8 @@ static void
 make_return_insns (rtx first)
 {
   rtx insn, jump_insn, pat;
-  rtx real_return_label = end_of_function_label;
+  rtx real_return_label = function_return_label;
+  rtx real_simple_return_label = function_simple_return_label;
   int slots, i;
 
 #ifdef DELAY_SLOTS_FOR_EPILOGUE
@@ -3728,15 +3740,22 @@ make_return_insns (rtx first)
      made for END_OF_FUNCTION_LABEL.  If so, set up anything we can't change
      into a RETURN to jump to it.  */
   for (insn = first; insn; insn = NEXT_INSN (insn))
-    if (JUMP_P (insn) && GET_CODE (PATTERN (insn)) == RETURN)
+    if (JUMP_P (insn) && ANY_RETURN_P (PATTERN (insn)))
       {
-	real_return_label = get_label_before (insn);
+	rtx t = get_label_before (insn);
+	if (PATTERN (insn) == ret_rtx)
+	  real_return_label = t;
+	else
+	  real_simple_return_label = t;
 	break;
       }
 
   /* Show an extra usage of REAL_RETURN_LABEL so it won't go away if it
      was equal to END_OF_FUNCTION_LABEL.  */
-  LABEL_NUSES (real_return_label)++;
+  if (real_return_label)
+    LABEL_NUSES (real_return_label)++;
+  if (real_simple_return_label)
+    LABEL_NUSES (real_simple_return_label)++;
 
   /* Clear the list of insns to fill so we can use it.  */
   obstack_free (&unfilled_slots_obstack, unfilled_firstobj);
@@ -3744,13 +3763,27 @@ make_return_insns (rtx first)
   for (insn = first; insn; insn = NEXT_INSN (insn))
     {
       int flags;
+      rtx kind, real_label;
 
       /* Only look at filled JUMP_INSNs that go to the end of function
 	 label.  */
       if (!NONJUMP_INSN_P (insn)
 	  || GET_CODE (PATTERN (insn)) != SEQUENCE
-	  || !JUMP_P (XVECEXP (PATTERN (insn), 0, 0))
-	  || JUMP_LABEL (XVECEXP (PATTERN (insn), 0, 0)) != end_of_function_label)
+	  || !jump_to_label_p (XVECEXP (PATTERN (insn), 0, 0)))
+	continue;
+
+      if (JUMP_LABEL (XVECEXP (PATTERN (insn), 0, 0)) == function_return_label)
+	{
+	  kind = ret_rtx;
+	  real_label = real_return_label;
+	}
+      else if (JUMP_LABEL (XVECEXP (PATTERN (insn), 0, 0))
+	       == function_simple_return_label)
+	{
+	  kind = simple_return_rtx;
+	  real_label = real_simple_return_label;
+	}
+      else
 	continue;
 
       pat = PATTERN (insn);
@@ -3758,14 +3791,12 @@ make_return_insns (rtx first)
 
       /* If we can't make the jump into a RETURN, try to redirect it to the best
 	 RETURN and go on to the next insn.  */
-      if (! reorg_redirect_jump (jump_insn, ret_rtx))
+      if (!reorg_redirect_jump (jump_insn, kind))
 	{
 	  /* Make sure redirecting the jump will not invalidate the delay
 	     slot insns.  */
-	  if (redirect_with_delay_slots_safe_p (jump_insn,
-						real_return_label,
-						insn))
-	    reorg_redirect_jump (jump_insn, real_return_label);
+	  if (redirect_with_delay_slots_safe_p (jump_insn, real_label, insn))
+	    reorg_redirect_jump (jump_insn, real_label);
 	  continue;
 	}
 
@@ -3805,7 +3836,7 @@ make_return_insns (rtx first)
 	 RETURN, delete the SEQUENCE and output the individual insns,
 	 followed by the RETURN.  Then set things up so we try to find
 	 insns for its delay slots, if it needs some.  */
-      if (GET_CODE (PATTERN (jump_insn)) == RETURN)
+      if (ANY_RETURN_P (PATTERN (jump_insn)))
 	{
 	  rtx prev = PREV_INSN (insn);
 
@@ -3822,13 +3853,16 @@ make_return_insns (rtx first)
       else
 	/* It is probably more efficient to keep this with its current
 	   delay slot as a branch to a RETURN.  */
-	reorg_redirect_jump (jump_insn, real_return_label);
+	reorg_redirect_jump (jump_insn, real_label);
     }
 
   /* Now delete REAL_RETURN_LABEL if we never used it.  Then try to fill any
      new delay slots we have created.  */
-  if (--LABEL_NUSES (real_return_label) == 0)
+  if (real_return_label != NULL_RTX && --LABEL_NUSES (real_return_label) == 0)
     delete_related_insns (real_return_label);
+  if (real_simple_return_label != NULL_RTX
+      && --LABEL_NUSES (real_simple_return_label) == 0)
+    delete_related_insns (real_simple_return_label);
 
   fill_simple_delay_slots (1);
   fill_simple_delay_slots (0);
@@ -3842,6 +3876,7 @@ dbr_schedule (rtx first)
 {
   rtx insn, next, epilogue_insn = 0;
   int i;
+  bool need_return_insns;
 
   /* If the current function has no insns other than the prologue and
      epilogue, then do not try to fill any delay slots.  */
@@ -3897,7 +3932,7 @@ dbr_schedule (rtx first)
   init_resource_info (epilogue_insn);
 
   /* Show we haven't computed an end-of-function label yet.  */
-  end_of_function_label = 0;
+  function_return_label = function_simple_return_label = NULL_RTX;
 
   /* Initialize the statistics for this function.  */
   memset (num_insns_needing_delays, 0, sizeof num_insns_needing_delays);
@@ -3919,13 +3954,21 @@ dbr_schedule (rtx first)
   /* If we made an end of function label, indicate that it is now
      safe to delete it by undoing our prior adjustment to LABEL_NUSES.
      If it is now unused, delete it.  */
-  if (end_of_function_label && --LABEL_NUSES (end_of_function_label) == 0)
-    delete_related_insns (end_of_function_label);
+  if (function_return_label && --LABEL_NUSES (function_return_label) == 0)
+    delete_related_insns (function_return_label);
+  if (function_simple_return_label
+      && --LABEL_NUSES (function_simple_return_label) == 0)
+    delete_related_insns (function_simple_return_label);
 
+  need_return_insns = false;
 #ifdef HAVE_return
-  if (HAVE_return && end_of_function_label != 0)
-    make_return_insns (first);
+  need_return_insns |= HAVE_return && function_return_label != 0;
 #endif
+#ifdef HAVE_simple_return
+  need_return_insns |= HAVE_simple_return && function_simple_return_label != 0;
+#endif
+  if (need_return_insns)
+    make_return_insns (first);
 
   /* Delete any USE insns made by update_block; subsequent passes don't need
      them or know how to deal with them.  */
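Illustration only, not part of the patch: the label bookkeeping that make_return_insns now does for each filled jump can be sketched in isolation as below. Every name with a sketch_ prefix is made up for this example and is not a GCC type or function; the only thing taken from the hunks above is the three-way choice between "return", "simple_return" and "skip this insn".

```c
#include <stddef.h>

/* Toy stand-ins; none of these are GCC's real types.  */
enum sketch_kind { SKETCH_NONE, SKETCH_RETURN, SKETCH_SIMPLE_RETURN };

struct sketch_label { int uid; };

struct sketch_choice
{
  enum sketch_kind kind;                  /* which return rtx to try first */
  const struct sketch_label *real_label;  /* fallback label to redirect to */
};

/* Decide how a filled jump to TARGET should be handled.  RETURN_LABEL and
   SIMPLE_RETURN_LABEL play the role of function_return_label and
   function_simple_return_label; either may be NULL.  A result of
   SKETCH_NONE corresponds to the "continue" in the loop.  */
struct sketch_choice
sketch_pick_return (const struct sketch_label *target,
                    const struct sketch_label *return_label,
                    const struct sketch_label *simple_return_label,
                    const struct sketch_label *real_return_label,
                    const struct sketch_label *real_simple_return_label)
{
  struct sketch_choice c = { SKETCH_NONE, NULL };

  if (target == NULL)
    return c;
  if (target == return_label)
    {
      c.kind = SKETCH_RETURN;
      c.real_label = real_return_label;
    }
  else if (target == simple_return_label)
    {
      c.kind = SKETCH_SIMPLE_RETURN;
      c.real_label = real_simple_return_label;
    }
  return c;
}
```

The point of carrying both the kind and the fallback label together is that a jump which cannot be turned into the matching return is redirected to a label of the same kind, never to the other one.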
Index: gcc/genemit.c
===================================================================
--- gcc/genemit.c	(revision 177999)
+++ gcc/genemit.c	(working copy)
@@ -169,6 +169,9 @@ gen_exp (rtx x, enum rtx_code subroutine
     case RETURN:
       printf ("ret_rtx");
       return;
+    case SIMPLE_RETURN:
+      printf ("simple_return_rtx");
+      return;
     case CLOBBER:
       if (REG_P (XEXP (x, 0)))
 	{
@@ -489,8 +492,8 @@ gen_expand (rtx expand)
 	  || (GET_CODE (next) == PARALLEL
 	      && ((GET_CODE (XVECEXP (next, 0, 0)) == SET
 		   && GET_CODE (SET_DEST (XVECEXP (next, 0, 0))) == PC)
-		  || GET_CODE (XVECEXP (next, 0, 0)) == RETURN))
-	  || GET_CODE (next) == RETURN)
+		  || ANY_RETURN_P (XVECEXP (next, 0, 0))))
+	  || ANY_RETURN_P (next))
 	printf ("  emit_jump_insn (");
       else if ((GET_CODE (next) == SET && GET_CODE (SET_SRC (next)) == CALL)
 	       || GET_CODE (next) == CALL
@@ -607,7 +610,7 @@ gen_split (rtx split)
 	  || (GET_CODE (next) == PARALLEL
 	      && GET_CODE (XVECEXP (next, 0, 0)) == SET
 	      && GET_CODE (SET_DEST (XVECEXP (next, 0, 0))) == PC)
-	  || GET_CODE (next) == RETURN)
+	  || ANY_RETURN_P (next))
 	printf ("  emit_jump_insn (");
       else if ((GET_CODE (next) == SET && GET_CODE (SET_SRC (next)) == CALL)
 	       || GET_CODE (next) == CALL
Index: gcc/df-scan.c
===================================================================
--- gcc/df-scan.c	(revision 177999)
+++ gcc/df-scan.c	(working copy)
@@ -3181,6 +3181,7 @@ df_uses_record (struct df_collection_rec
       }
 
     case RETURN:
+    case SIMPLE_RETURN:
       break;
 
     case ASM_OPERANDS:
Index: gcc/rtl.def
===================================================================
--- gcc/rtl.def	(revision 177999)
+++ gcc/rtl.def	(working copy)
@@ -731,6 +731,10 @@ DEF_RTL_EXPR(ENTRY_VALUE, "entry_value",
    been optimized away completely.  */
 DEF_RTL_EXPR(DEBUG_PARAMETER_REF, "debug_parameter_ref", "t", RTX_OBJ)
 
+/* A plain return, to be used on paths that are reached without going
+   through the function prologue.  */
+DEF_RTL_EXPR(SIMPLE_RETURN, "simple_return", "", RTX_EXTRA)
+
 /* All expressions from this point forward appear only in machine
    descriptions.  */
 #ifdef GENERATOR_FILE
Index: gcc/ifcvt.c
===================================================================
--- gcc/ifcvt.c	(revision 177999)
+++ gcc/ifcvt.c	(working copy)
@@ -3796,6 +3796,7 @@ find_if_case_1 (basic_block test_bb, edg
   basic_block then_bb = then_edge->dest;
   basic_block else_bb = else_edge->dest;
   basic_block new_bb;
+  rtx else_target = NULL_RTX;
   int then_bb_index;
 
   /* If we are partitioning hot/cold basic blocks, we don't want to
@@ -3845,6 +3846,13 @@ find_if_case_1 (basic_block test_bb, edg
 				    predictable_edge_p (then_edge)))))
     return FALSE;
 
+  if (else_bb == EXIT_BLOCK_PTR)
+    {
+      rtx jump = BB_END (else_edge->src);
+      gcc_assert (JUMP_P (jump));
+      else_target = JUMP_LABEL (jump);
+    }
+
   /* Registers set are dead, or are predicable.  */
   if (! dead_or_predicable (test_bb, then_bb, else_bb,
 			    single_succ_edge (then_bb), 1))
@@ -3864,6 +3872,9 @@ find_if_case_1 (basic_block test_bb, edg
       redirect_edge_succ (FALLTHRU_EDGE (test_bb), else_bb);
       new_bb = 0;
     }
+  else if (else_bb == EXIT_BLOCK_PTR)
+    new_bb = force_nonfallthru_and_redirect (FALLTHRU_EDGE (test_bb),
+					     else_bb, else_target);
   else
     new_bb = redirect_edge_and_branch_force (FALLTHRU_EDGE (test_bb),
 					     else_bb);
Index: gcc/jump.c
===================================================================
--- gcc/jump.c	(revision 177999)
+++ gcc/jump.c	(working copy)
@@ -29,7 +29,8 @@ along with GCC; see the file COPYING3.
    JUMP_LABEL internal field.  With this we can detect labels that
    become unused because of the deletion of all the jumps that
    formerly used them.  The JUMP_LABEL info is sometimes looked
-   at by later passes.
+   at by later passes.  For return insns, it contains either a
+   RETURN or a SIMPLE_RETURN rtx.
 
    The subroutines redirect_jump and invert_jump are used
    from other passes as well.  */
@@ -775,10 +776,10 @@ condjump_p (const_rtx insn)
     return (GET_CODE (x) == IF_THEN_ELSE
 	    && ((GET_CODE (XEXP (x, 2)) == PC
 		 && (GET_CODE (XEXP (x, 1)) == LABEL_REF
-		     || GET_CODE (XEXP (x, 1)) == RETURN))
+		     || ANY_RETURN_P (XEXP (x, 1))))
 		|| (GET_CODE (XEXP (x, 1)) == PC
 		    && (GET_CODE (XEXP (x, 2)) == LABEL_REF
-			|| GET_CODE (XEXP (x, 2)) == RETURN))));
+			|| ANY_RETURN_P (XEXP (x, 2))))));
 }
 
 /* Return nonzero if INSN is a (possibly) conditional jump inside a
@@ -807,11 +808,11 @@ condjump_in_parallel_p (const_rtx insn)
     return 0;
   if (XEXP (SET_SRC (x), 2) == pc_rtx
       && (GET_CODE (XEXP (SET_SRC (x), 1)) == LABEL_REF
-	  || GET_CODE (XEXP (SET_SRC (x), 1)) == RETURN))
+	  || ANY_RETURN_P (XEXP (SET_SRC (x), 1))))
     return 1;
   if (XEXP (SET_SRC (x), 1) == pc_rtx
       && (GET_CODE (XEXP (SET_SRC (x), 2)) == LABEL_REF
-	  || GET_CODE (XEXP (SET_SRC (x), 2)) == RETURN))
+	  || ANY_RETURN_P (XEXP (SET_SRC (x), 2))))
     return 1;
   return 0;
 }
@@ -873,8 +874,9 @@ any_condjump_p (const_rtx insn)
   a = GET_CODE (XEXP (SET_SRC (x), 1));
   b = GET_CODE (XEXP (SET_SRC (x), 2));
 
-  return ((b == PC && (a == LABEL_REF || a == RETURN))
-	  || (a == PC && (b == LABEL_REF || b == RETURN)));
+  return ((b == PC && (a == LABEL_REF || a == RETURN || a == SIMPLE_RETURN))
+	  || (a == PC
+	      && (b == LABEL_REF || b == RETURN || b == SIMPLE_RETURN)));
 }
 
 /* Return the label of a conditional jump.  */
@@ -911,6 +913,7 @@ returnjump_p_1 (rtx *loc, void *data ATT
   switch (GET_CODE (x))
     {
     case RETURN:
+    case SIMPLE_RETURN:
     case EH_RETURN:
       return true;
 
Index: gcc/function.c
===================================================================
--- gcc/function.c	(revision 177999)
+++ gcc/function.c	(working copy)
@@ -5306,7 +5306,11 @@ static void
 emit_return_into_block (basic_block bb)
 {
   rtx jump = emit_jump_insn_after (gen_return (), BB_END (bb));
-  JUMP_LABEL (jump) = ret_rtx;
+  rtx pat = PATTERN (jump);
+  if (GET_CODE (pat) == PARALLEL)
+    pat = XVECEXP (pat, 0, 0);
+  gcc_assert (ANY_RETURN_P (pat));
+  JUMP_LABEL (jump) = pat;
 }
 #endif /* HAVE_return */
 
Index: gcc/print-rtl.c
===================================================================
--- gcc/print-rtl.c	(revision 177999)
+++ gcc/print-rtl.c	(working copy)
@@ -328,6 +328,8 @@ print_rtx (const_rtx in_rtx)
 	    fprintf (outfile, "\n%s%*s -> ", print_rtx_head, indent * 2, "");
 	    if (GET_CODE (JUMP_LABEL (in_rtx)) == RETURN)
 	      fprintf (outfile, "return");
+	    else if (GET_CODE (JUMP_LABEL (in_rtx)) == SIMPLE_RETURN)
+	      fprintf (outfile, "simple_return");
 	    else
 	      fprintf (outfile, "%d", INSN_UID (JUMP_LABEL (in_rtx)));
 	  }
Index: gcc/bt-load.c
===================================================================
--- gcc/bt-load.c	(revision 177999)
+++ gcc/bt-load.c	(working copy)
@@ -558,7 +558,7 @@ compute_defs_uses_and_gen (fibheap_t all
 		      /* Check for sibcall.  */
 		      if (GET_CODE (pat) == PARALLEL)
 			for (i = XVECLEN (pat, 0) - 1; i >= 0; i--)
-			  if (GET_CODE (XVECEXP (pat, 0, i)) == RETURN)
+			  if (ANY_RETURN_P (XVECEXP (pat, 0, i)))
 			    {
 			      COMPL_HARD_REG_SET (call_saved,
 						  call_used_reg_set);
Index: gcc/emit-rtl.c
===================================================================
--- gcc/emit-rtl.c	(revision 177999)
+++ gcc/emit-rtl.c	(working copy)
@@ -2518,6 +2518,7 @@ verify_rtx_sharing (rtx orig, rtx insn)
     case PC:
     case CC0:
     case RETURN:
+    case SIMPLE_RETURN:
     case SCRATCH:
       return;
       /* SCRATCH must be shared because they represent distinct values.  */
@@ -2725,6 +2726,7 @@ repeat:
     case PC:
     case CC0:
     case RETURN:
+    case SIMPLE_RETURN:
     case SCRATCH:
       /* SCRATCH must be shared because they represent distinct values.  */
       return;
@@ -2845,6 +2847,7 @@ repeat:
     case PC:
     case CC0:
     case RETURN:
+    case SIMPLE_RETURN:
       return;
 
     case DEBUG_INSN:
@@ -5008,7 +5011,7 @@ classify_insn (rtx x)
     return CODE_LABEL;
   if (GET_CODE (x) == CALL)
     return CALL_INSN;
-  if (GET_CODE (x) == RETURN)
+  if (ANY_RETURN_P (x))
     return JUMP_INSN;
   if (GET_CODE (x) == SET)
     {
@@ -5264,6 +5267,7 @@ copy_insn_1 (rtx orig)
     case PC:
     case CC0:
     case RETURN:
+    case SIMPLE_RETURN:
       return orig;
     case CLOBBER:
       if (REG_P (XEXP (orig, 0)) && REGNO (XEXP (orig, 0)) < FIRST_PSEUDO_REGISTER)
@@ -5521,6 +5525,7 @@ init_emit_regs (void)
   /* Assign register numbers to the globally defined register rtx.  */
   pc_rtx = gen_rtx_fmt_ (PC, VOIDmode);
   ret_rtx = gen_rtx_fmt_ (RETURN, VOIDmode);
+  simple_return_rtx = gen_rtx_fmt_ (SIMPLE_RETURN, VOIDmode);
   cc0_rtx = gen_rtx_fmt_ (CC0, VOIDmode);
   stack_pointer_rtx = gen_raw_REG (Pmode, STACK_POINTER_REGNUM);
   frame_pointer_rtx = gen_raw_REG (Pmode, FRAME_POINTER_REGNUM);
Index: gcc/cfglayout.c
===================================================================
--- gcc/cfglayout.c	(revision 177999)
+++ gcc/cfglayout.c	(working copy)
@@ -767,6 +767,7 @@ fixup_reorder_chain (void)
     {
       edge e_fall, e_taken, e;
       rtx bb_end_insn;
+      rtx ret_label = NULL_RTX;
       basic_block nb, src_bb;
       edge_iterator ei;
 
@@ -786,6 +787,7 @@ fixup_reorder_chain (void)
       bb_end_insn = BB_END (bb);
       if (JUMP_P (bb_end_insn))
 	{
+	  ret_label = JUMP_LABEL (bb_end_insn);
 	  if (any_condjump_p (bb_end_insn))
 	    {
 	      /* This might happen if the conditional jump has side
@@ -899,7 +901,7 @@ fixup_reorder_chain (void)
 	 Note force_nonfallthru can delete E_FALL and thus we have to
 	 save E_FALL->src prior to the call to force_nonfallthru.  */
       src_bb = e_fall->src;
-      nb = force_nonfallthru_and_redirect (e_fall, e_fall->dest);
+      nb = force_nonfallthru_and_redirect (e_fall, e_fall->dest, ret_label);
       if (nb)
 	{
 	  nb->il.rtl->visited = 1;
Index: gcc/rtl.c
===================================================================
--- gcc/rtl.c	(revision 177999)
+++ gcc/rtl.c	(working copy)
@@ -256,6 +256,7 @@ copy_rtx (rtx orig)
     case PC:
     case CC0:
     case RETURN:
+    case SIMPLE_RETURN:
     case SCRATCH:
       /* SCRATCH must be shared because they represent distinct values.  */
       return orig;
Index: gcc/rtl.h
===================================================================
--- gcc/rtl.h	(revision 177999)
+++ gcc/rtl.h	(working copy)
@@ -432,8 +432,9 @@ struct GTY((variable_size)) rtvec_def {
   (JUMP_P (INSN) && (GET_CODE (PATTERN (INSN)) == ADDR_VEC || \
 		     GET_CODE (PATTERN (INSN)) == ADDR_DIFF_VEC))
 
-/* Predicate yielding nonzero iff X is a return.  */
-#define ANY_RETURN_P(X) ((X) == ret_rtx)
+/* Predicate yielding nonzero iff X is a return or simple_return.  */
+#define ANY_RETURN_P(X) \
+  (GET_CODE (X) == RETURN || GET_CODE (X) == SIMPLE_RETURN)
 
 /* 1 if X is a unary operator.  */
 
@@ -2111,6 +2112,7 @@ enum global_rtl_index
   GR_PC,
   GR_CC0,
   GR_RETURN,
+  GR_SIMPLE_RETURN,
   GR_STACK_POINTER,
   GR_FRAME_POINTER,
 /* For register elimination to work properly these hard_frame_pointer_rtx,
@@ -2206,6 +2208,7 @@ extern struct target_rtl *this_target_rt
 /* Standard pieces of rtx, to be substituted directly into things.  */
 #define pc_rtx                  (global_rtl[GR_PC])
 #define ret_rtx                 (global_rtl[GR_RETURN])
+#define simple_return_rtx       (global_rtl[GR_SIMPLE_RETURN])
 #define cc0_rtx                 (global_rtl[GR_CC0])
 
 /* All references to certain hard regs, except those created
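Illustration only, not part of the patch: a standalone sketch of what the ANY_RETURN_P change in rtl.h amounts to. The toy_ and TOY_ names are invented for the example and are not GCC's definitions; only the shape of the predicate follows the hunk above. The old macro compared against the single shared ret_rtx object, while the new one tests the rtx code so that both kinds of return are accepted.

```c
/* Toy stand-ins for GCC's rtx codes; illustrative only.  */
enum toy_code { TOY_PC, TOY_RETURN, TOY_SIMPLE_RETURN };

struct toy_rtx { enum toy_code code; };

#define TOY_GET_CODE(X) ((X)->code)

/* Same shape as the new ANY_RETURN_P: true for either kind of return.  */
#define TOY_ANY_RETURN_P(X) \
  (TOY_GET_CODE (X) == TOY_RETURN || TOY_GET_CODE (X) == TOY_SIMPLE_RETURN)

/* Shared objects, analogous to pc_rtx, ret_rtx and simple_return_rtx.  */
struct toy_rtx toy_pc = { TOY_PC };
struct toy_rtx toy_ret = { TOY_RETURN };
struct toy_rtx toy_simple_ret = { TOY_SIMPLE_RETURN };

/* Return 0 if X is not a return, 1 for a plain return, 2 for a
   simple return.  */
int
toy_classify_return (const struct toy_rtx *x)
{
  if (!TOY_ANY_RETURN_P (x))
    return 0;
  return TOY_GET_CODE (x) == TOY_RETURN ? 1 : 2;
}
```

Code that needs to know which of the two it has can still compare against the shared object or the code after the predicate has matched, which is what the reorg.c and ARM changes do.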
Index: gcc/combine.c
===================================================================
--- gcc/combine.c	(revision 177999)
+++ gcc/combine.c	(working copy)
@@ -6303,7 +6303,7 @@ simplify_set (rtx x)
   rtx *cc_use;
 
   /* (set (pc) (return)) gets written as (return).  */
-  if (GET_CODE (dest) == PC && GET_CODE (src) == RETURN)
+  if (GET_CODE (dest) == PC && ANY_RETURN_P (src))
     return src;
 
   /* Now that we know for sure which bits of SRC we are using, see if we can
Index: gcc/resource.c
===================================================================
--- gcc/resource.c	(revision 177999)
+++ gcc/resource.c	(working copy)
@@ -492,7 +492,7 @@ find_dead_or_set_registers (rtx target,
 	  if (jump_count++ < 10)
 	    {
 	      if (any_uncondjump_p (this_jump_insn)
-		  || GET_CODE (PATTERN (this_jump_insn)) == RETURN)
+		  || ANY_RETURN_P (PATTERN (this_jump_insn)))
 		{
 		  next = JUMP_LABEL (this_jump_insn);
 		  if (ANY_RETURN_P (next))
@@ -829,7 +829,7 @@ mark_set_resources (rtx x, struct resour
 static bool
 return_insn_p (const_rtx insn)
 {
-  if (JUMP_P (insn) && GET_CODE (PATTERN (insn)) == RETURN)
+  if (JUMP_P (insn) && ANY_RETURN_P (PATTERN (insn)))
     return true;
 
   if (NONJUMP_INSN_P (insn) && GET_CODE (PATTERN (insn)) == SEQUENCE)
Index: gcc/basic-block.h
===================================================================
--- gcc/basic-block.h	(revision 177999)
+++ gcc/basic-block.h	(working copy)
@@ -804,7 +804,7 @@ extern rtx block_label (basic_block);
 extern bool purge_all_dead_edges (void);
 extern bool purge_dead_edges (basic_block);
 extern bool fixup_abnormal_edges (void);
-extern basic_block force_nonfallthru_and_redirect (edge, basic_block);
+extern basic_block force_nonfallthru_and_redirect (edge, basic_block, rtx);
 
 /* In cfgbuild.c.  */
 extern void find_many_sub_basic_blocks (sbitmap);
Index: gcc/sched-vis.c
===================================================================
--- gcc/sched-vis.c	(revision 177999)
+++ gcc/sched-vis.c	(working copy)
@@ -554,6 +554,9 @@ print_pattern (char *buf, const_rtx x, i
     case RETURN:
       sprintf (buf, "return");
       break;
+    case SIMPLE_RETURN:
+      sprintf (buf, "simple_return");
+      break;
     case CALL:
       print_exp (buf, x, verbose);
       break;
Index: gcc/config/i386/i386.c
===================================================================
--- gcc/config/i386/i386.c	(revision 177999)
+++ gcc/config/i386/i386.c	(working copy)
@@ -30545,7 +30545,7 @@ ix86_pad_returns (void)
       rtx prev;
       bool replace = false;
 
-      if (!JUMP_P (ret) || GET_CODE (PATTERN (ret)) != RETURN
+      if (!JUMP_P (ret) || !ANY_RETURN_P (PATTERN (ret))
 	  || optimize_bb_for_size_p (bb))
 	continue;
       for (prev = PREV_INSN (ret); prev; prev = PREV_INSN (prev))
@@ -30596,7 +30596,7 @@ ix86_count_insn_bb (basic_block bb)
     {
       /* Only happen in exit blocks.  */
       if (JUMP_P (insn)
-	  && GET_CODE (PATTERN (insn)) == RETURN)
+	  && ANY_RETURN_P (PATTERN (insn)))
 	break;
 
       if (NONDEBUG_INSN_P (insn)
@@ -30669,7 +30669,7 @@ ix86_pad_short_function (void)
   FOR_EACH_EDGE (e, ei, EXIT_BLOCK_PTR->preds)
     {
       rtx ret = BB_END (e->src);
-      if (JUMP_P (ret) && GET_CODE (PATTERN (ret)) == RETURN)
+      if (JUMP_P (ret) && ANY_RETURN_P (PATTERN (ret)))
 	{
 	  int insn_count = ix86_count_insn (e->src);
 
Index: gcc/config/arm/arm.c
===================================================================
--- gcc/config/arm/arm.c	(revision 177999)
+++ gcc/config/arm/arm.c	(working copy)
@@ -17723,6 +17723,7 @@ arm_final_prescan_insn (rtx insn)
 
   /* If we start with a return insn, we only succeed if we find another one.  */
   int seeking_return = 0;
+  enum rtx_code return_code = UNKNOWN;
 
   /* START_INSN will hold the insn from where we start looking.  This is the
      first insn after the following code_label if REVERSE is true.  */
@@ -17761,7 +17762,7 @@ arm_final_prescan_insn (rtx insn)
 	  else
 	    return;
 	}
-      else if (GET_CODE (body) == RETURN)
+      else if (ANY_RETURN_P (body))
         {
 	  start_insn = next_nonnote_insn (start_insn);
 	  if (GET_CODE (start_insn) == BARRIER)
@@ -17772,6 +17773,7 @@ arm_final_prescan_insn (rtx insn)
 	    {
 	      reverse = TRUE;
 	      seeking_return = 1;
+	      return_code = GET_CODE (body);
 	    }
 	  else
 	    return;
@@ -17812,11 +17814,15 @@ arm_final_prescan_insn (rtx insn)
 	  label = XEXP (XEXP (SET_SRC (body), 2), 0);
 	  then_not_else = FALSE;
 	}
-      else if (GET_CODE (XEXP (SET_SRC (body), 1)) == RETURN)
-	seeking_return = 1;
-      else if (GET_CODE (XEXP (SET_SRC (body), 2)) == RETURN)
+      else if (ANY_RETURN_P (XEXP (SET_SRC (body), 1)))
+	{
+	  seeking_return = 1;
+	  return_code = GET_CODE (XEXP (SET_SRC (body), 1));
+	}
+      else if (ANY_RETURN_P (XEXP (SET_SRC (body), 2)))
         {
 	  seeking_return = 1;
+	  return_code = GET_CODE (XEXP (SET_SRC (body), 2));
 	  then_not_else = FALSE;
         }
       else
@@ -17913,12 +17919,11 @@ arm_final_prescan_insn (rtx insn)
 		}
 	      /* Fail if a conditional return is undesirable (e.g. on a
 		 StrongARM), but still allow this if optimizing for size.  */
-	      else if (GET_CODE (scanbody) == RETURN
+	      else if (GET_CODE (scanbody) == return_code
 		       && !use_return_insn (TRUE, NULL)
 		       && !optimize_size)
 		fail = TRUE;
-	      else if (GET_CODE (scanbody) == RETURN
-		       && seeking_return)
+	      else if (GET_CODE (scanbody) == return_code)
 	        {
 		  arm_ccfsm_state = 2;
 		  succeed = TRUE;
Index: gcc/config/mips/mips.md
===================================================================
--- gcc/config/mips/mips.md	(revision 177999)
+++ gcc/config/mips/mips.md	(working copy)
@@ -777,6 +777,8 @@ (define_code_iterator any_ge [ge geu])
 (define_code_iterator any_lt [lt ltu])
 (define_code_iterator any_le [le leu])
 
+(define_code_iterator any_return [return simple_return])
+
 ;; <u> expands to an empty string when doing a signed operation and
 ;; "u" when doing an unsigned operation.
 (define_code_attr u [(sign_extend "") (zero_extend "u")
@@ -798,7 +800,9 @@ (define_code_attr optab [(ashift "ashl")
 			 (xor "xor")
 			 (and "and")
 			 (plus "add")
-			 (minus "sub")])
+			 (minus "sub")
+			 (return "return")
+			 (simple_return "simple_return")])
 
 ;; <insn> expands to the name of the insn that implements a particular code.
 (define_code_attr insn [(ashift "sll")
@@ -5713,21 +5717,26 @@ (define_expand "sibcall_epilogue"
 ;; allows jump optimizations to work better.
 
 (define_expand "return"
-  [(return)]
+  [(simple_return)]
   "mips_can_use_return_insn ()"
   { mips_expand_before_return (); })
 
-(define_insn "*return"
-  [(return)]
-  "mips_can_use_return_insn ()"
+(define_expand "simple_return"
+  [(simple_return)]
+  ""
+  { mips_expand_before_return (); })
+
+(define_insn "*<optab>"
+  [(any_return)]
+  ""
   "%*j\t$31%/"
   [(set_attr "type"	"jump")
    (set_attr "mode"	"none")])
 
 ;; Normal return.
 
-(define_insn "return_internal"
-  [(return)
+(define_insn "<optab>_internal"
+  [(any_return)
    (use (match_operand 0 "pmode_register_operand" ""))]
   ""
   "%*j\t%0%/"
Index: gcc/config/mips/mips.c
===================================================================
--- gcc/config/mips/mips.c	(revision 177999)
+++ gcc/config/mips/mips.c	(working copy)
@@ -10453,7 +10453,8 @@ mips_expand_epilogue (bool sibcall_p)
 	    regno = GP_REG_FIRST + 7;
 	  else
 	    regno = RETURN_ADDR_REGNUM;
-	  emit_jump_insn (gen_return_internal (gen_rtx_REG (Pmode, regno)));
+	  emit_jump_insn (gen_simple_return_internal (gen_rtx_REG (Pmode,
+								   regno)));
 	}
     }
 
Index: gcc/cfgrtl.c
===================================================================
--- gcc/cfgrtl.c	(revision 177999)
+++ gcc/cfgrtl.c	(working copy)
@@ -1117,10 +1117,13 @@ rtl_redirect_edge_and_branch (edge e, ba
 }
 
 /* Like force_nonfallthru below, but additionally performs redirection
-   Used by redirect_edge_and_branch_force.  */
+   Used by redirect_edge_and_branch_force.  JUMP_LABEL is used only
+   when redirecting to the EXIT_BLOCK; it is either ret_rtx or
+   simple_return_rtx, indicating which kind of returnjump to create.
+   It should be NULL otherwise.  */
 
 basic_block
-force_nonfallthru_and_redirect (edge e, basic_block target)
+force_nonfallthru_and_redirect (edge e, basic_block target, rtx jump_label)
 {
   basic_block jump_block, new_bb = NULL, src = e->src;
   rtx note;
@@ -1252,12 +1255,25 @@ force_nonfallthru_and_redirect (edge e,
   e->flags &= ~EDGE_FALLTHRU;
   if (target == EXIT_BLOCK_PTR)
     {
+      if (jump_label == ret_rtx)
+	{
 #ifdef HAVE_return
-	emit_jump_insn_after_setloc (gen_return (), BB_END (jump_block), loc);
-	JUMP_LABEL (BB_END (jump_block)) = ret_rtx;
+	  emit_jump_insn_after_setloc (gen_return (), BB_END (jump_block), loc);
 #else
-	gcc_unreachable ();
+	  gcc_unreachable ();
+#endif
+	}
+      else
+	{
+	  gcc_assert (jump_label == simple_return_rtx);
+#ifdef HAVE_simple_return
+	  emit_jump_insn_after_setloc (gen_simple_return (),
+				       BB_END (jump_block), loc);
+#else
+	  gcc_unreachable ();
 #endif
+	}
+      JUMP_LABEL (BB_END (jump_block)) = jump_label;
     }
   else
     {
@@ -1284,7 +1300,7 @@ force_nonfallthru_and_redirect (edge e,
 static basic_block
 rtl_force_nonfallthru (edge e)
 {
-  return force_nonfallthru_and_redirect (e, e->dest);
+  return force_nonfallthru_and_redirect (e, e->dest, NULL_RTX);
 }
 
 /* Redirect edge even at the expense of creating new jump insn or
@@ -1301,7 +1317,7 @@ rtl_redirect_edge_and_branch_force (edge
   /* In case the edge redirection failed, try to force it to be non-fallthru
      and redirect newly created simplejump.  */
   df_set_bb_dirty (e->src);
-  return force_nonfallthru_and_redirect (e, target);
+  return force_nonfallthru_and_redirect (e, target, NULL_RTX);
 }
 
 /* The given edge should potentially be a fallthru edge.  If that is in
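Illustration only, not part of the patch: the new third argument of force_nonfallthru_and_redirect selects which kind of return jump is created when the target is the exit block. A minimal model of that dispatch, with invented demo_ names and with the HAVE_return / HAVE_simple_return conditionals left out, might look like this:

```c
#include <assert.h>

/* Toy stand-ins; not GCC's types.  */
enum demo_code { DEMO_RETURN, DEMO_SIMPLE_RETURN };

struct demo_rtx { enum demo_code code; };

/* Shared objects, analogous to ret_rtx and simple_return_rtx.  */
struct demo_rtx demo_ret_rtx = { DEMO_RETURN };
struct demo_rtx demo_simple_return_rtx = { DEMO_SIMPLE_RETURN };

/* Return the code of the jump that would be emitted when redirecting to
   the exit block.  JUMP_LABEL must be one of the two shared objects; as
   in the patch, anything else trips the assertion.  */
enum demo_code
demo_exit_jump_code (const struct demo_rtx *jump_label)
{
  if (jump_label == &demo_ret_rtx)
    return DEMO_RETURN;
  assert (jump_label == &demo_simple_return_rtx);
  return DEMO_SIMPLE_RETURN;
}
```

In this model a null jump_label is only acceptable when the target is not the exit block, which matches the comment added above the function.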
Index: gcc/doc/md.texi
===================================================================
--- gcc/doc/md.texi	(revision 177999)
+++ gcc/doc/md.texi	(working copy)
@@ -4992,6 +4992,20 @@ some class of functions only requires on
 return.  Normally, the applicable functions are those which do not need
 to save any registers or allocate stack space.
 
+It is valid for this pattern to expand to an instruction using
+@code{simple_return} if no epilogue is required.
+
+@cindex @code{simple_return} instruction pattern
+@item @samp{simple_return}
+Subroutine return instruction.  This instruction pattern name should be
+defined only if a single instruction can do all the work of returning
+from a function on a path where no epilogue is required.  This pattern
+is very similar to the @code{return} instruction pattern, but it is emitted
+only by the shrink-wrapping optimization on paths where the function
+prologue has not been executed, and a function return should occur without
+any of the effects of the epilogue.  Additional uses may be introduced on
+paths where both the prologue and the epilogue have executed.
+
 @findex reload_completed
 @findex leaf_function_p
 For such machines, the condition specified in this pattern should only
