* Patch: PR40900, extending call patterns
@ 2010-04-30 1:05 Bernd Schmidt
2010-04-30 3:33 ` Andrew Pinski
0 siblings, 1 reply; 15+ messages in thread
From: Bernd Schmidt @ 2010-04-30 1:05 UTC (permalink / raw)
To: GCC Patches
[-- Attachment #1: Type: text/plain, Size: 684 bytes --]
The following patch fixes PR40900, which points out an unnecessary
sign-extension for the following code:
extern short shortv2();
short shortv1()
{
return shortv2();
}
The idea is to represent the sign-extension in the RTL call patterns,
choosing extending variants in calls.c when we notice a suitable
promotion. Then we let combine do its work; this needed another little
tweak to make sure it recorded the extension of the return register.
Bootstrapped and regression tested on i686-linux. An earlier version,
which had some unnecessary code and other small differences, was
regression tested on
arm-linux-gnueabi(qemu-system-armv7{arch=armv7-a/thumb,thumb,}). Ok?
Bernd
[-- Attachment #2: extend-calls3.diff --]
[-- Type: text/plain, Size: 16422 bytes --]
PR rtl-optimization/40900
* doc/md.texi (Standard Names): Document call_value_zext<mode,
call_value_sext<mode>, call_value_pop_sext<mode> and
call_value_pop_zext<mode>.
* optabs.h (enum optab_index): Add OTI_call_value_sext,
OTI_call_value_zext, OTI_call_value_pop_sext, OTI_call_value_po_zext.
(call_value_sext_optab, call_value_zext_optab,
call_value_pop_sext_optab, call_value_pop_zext_optab): New macros.
* genopinit.c (optabs): Initialize the new optabs.
* final.c (call_from_call_insn): Handle ZERO_EXTEND and SIGN_EXTEND.
* calls.c: Include "recog.h".
(emit_call_1): Look for and use extending call patterns if the return
value is promoted.
* combine.c (record_dead_and_set_regs): Use record_dead_and_set_regs_1
normally for call insns.
* config/arm/arm.md (define_code_iterator cext): New.
(define_code_attr optab): New.
(define_mode_iterator CEXT): New.
(call_value_<optab><mode>, call_value_<optab><mode>_internal,
call_value_<optab><mode>_reg_armv5, call_value_<optab><mode>_reg_arm,
call_value_<optab><mode>_mem, call_value_<optab><mode>_reg_thumb1_v5,
call_value_<optab><mode>_reg_thumb1, call_value_<optab><mode>_symbol,
call_value_<optab><mode>_insn): New patterns.
* config/arm/thumb2.md (call_value<optab><mode>_reg_thumb2): New
pattern.
PR rtl-optimization/40900
* gcc.target/arm/pr40900.c: New test.
Index: doc/md.texi
===================================================================
--- doc/md.texi (revision 158639)
+++ doc/md.texi (working copy)
@@ -4775,6 +4775,17 @@ For machines where @code{RETURN_POPS_ARG
patterns increases the number of functions for which the frame pointer
can be eliminated, if desired.
+@cindex @code{call_value_sext@var{mode}} instruction pattern
+@cindex @code{call_value_zext@var{mode}} instruction pattern
+@cindex @code{call_value_pop_sext@var{mode}} instruction pattern
+@cindex @code{call_value_pop_zext@var{mode}} instruction pattern
+
+These patterns correspond to @samp{call_value} and @samp{call_value_pop},
+but they describe a call which returns a sign-extended or zero-extended
+result. This may enable some optimization opportunities. Defining them
+is optional; if they don't exist, @samp{call_value} or @samp{call_value_pop}
+will be used in their place.
+
@cindex @code{untyped_call} instruction pattern
@item @samp{untyped_call}
Subroutine call instruction returning a value of any type. Operand 0 is
Index: optabs.h
===================================================================
--- optabs.h (revision 158643)
+++ optabs.h (working copy)
@@ -369,6 +369,12 @@ enum optab_index
/* Perform a raise to the power of integer. */
OTI_powi,
+ /* Perform a call where we know how the return value is extended. */
+ OTI_call_value_sext,
+ OTI_call_value_zext,
+ OTI_call_value_pop_sext,
+ OTI_call_value_pop_zext,
+
OTI_MAX
};
@@ -546,6 +552,11 @@ extern struct optab_d optab_table[OTI_MA
#define powi_optab (&optab_table[OTI_powi])
+#define call_value_sext_optab (&optab_table[OTI_call_value_sext])
+#define call_value_zext_optab (&optab_table[OTI_call_value_zext])
+#define call_value_pop_sext_optab (&optab_table[OTI_call_value_pop_sext])
+#define call_value_pop_zext_optab (&optab_table[OTI_call_value_pop_zext])
+
/* Conversion optabs have their own table and indexes. */
enum convert_optab_index
{
Index: genopinit.c
===================================================================
--- genopinit.c (revision 158639)
+++ genopinit.c (working copy)
@@ -272,7 +272,11 @@ static const char * const optabs[] =
"optab_handler (vec_pack_ssat_optab, $A)->insn_code = CODE_FOR_$(vec_pack_ssat_$a$)",
"optab_handler (vec_pack_usat_optab, $A)->insn_code = CODE_FOR_$(vec_pack_usat_$a$)",
"optab_handler (vec_pack_sfix_trunc_optab, $A)->insn_code = CODE_FOR_$(vec_pack_sfix_trunc_$a$)",
- "optab_handler (vec_pack_ufix_trunc_optab, $A)->insn_code = CODE_FOR_$(vec_pack_ufix_trunc_$a$)"
+ "optab_handler (vec_pack_ufix_trunc_optab, $A)->insn_code = CODE_FOR_$(vec_pack_ufix_trunc_$a$)",
+ "optab_handler (call_value_sext_optab, $A)->insn_code = CODE_FOR_$(call_value_sext$I$a$)",
+ "optab_handler (call_value_zext_optab, $A)->insn_code = CODE_FOR_$(call_value_zext$I$a$)",
+ "optab_handler (call_value_pop_sext_optab, $A)->insn_code = CODE_FOR_$(call_value_pop_sext$I$a$)",
+ "optab_handler (call_value_pop_zext_optab, $A)->insn_code = CODE_FOR_$(call_value_pop_zext$I$a$)"
};
static void gen_insn (rtx);
Index: final.c
===================================================================
--- final.c (revision 158639)
+++ final.c (working copy)
@@ -1791,6 +1791,10 @@ call_from_call_insn (rtx insn)
case PARALLEL:
x = XVECEXP (x, 0, 0);
break;
+ case ZERO_EXTEND:
+ case SIGN_EXTEND:
+ x = XEXP (x, 0);
+ break;
case SET:
x = XEXP (x, 1);
break;
Index: calls.c
===================================================================
--- calls.c (revision 158639)
+++ calls.c (working copy)
@@ -29,6 +29,7 @@ along with GCC; see the file COPYING3.
#include "flags.h"
#include "expr.h"
#include "optabs.h"
+#include "recog.h"
#include "libfuncs.h"
#include "function.h"
#include "regs.h"
@@ -256,6 +257,8 @@ emit_call_1 (rtx funexp, tree fntree ATT
rtx call_insn;
int already_popped = 0;
HOST_WIDE_INT n_popped = RETURN_POPS_ARGS (fndecl, funtype, stack_size);
+ enum insn_code pop_icode = CODE_FOR_nothing;
+ enum insn_code icode = CODE_FOR_nothing;
#ifdef CALL_POPS_ARGS
n_popped += CALL_POPS_ARGS (* args_so_far);
@@ -267,6 +270,31 @@ emit_call_1 (rtx funexp, tree fntree ATT
if (GET_CODE (funexp) != SYMBOL_REF)
funexp = memory_address (FUNCTION_MODE, funexp);
+
+ if (valreg)
+ {
+ enum machine_mode retval_mode, promoted_mode;
+ int retval_unsignedp;
+
+ retval_unsignedp = TYPE_UNSIGNED (TREE_TYPE (funtype));
+ retval_mode = TYPE_MODE (TREE_TYPE (funtype));
+ promoted_mode
+ = promote_function_mode (TREE_TYPE (funtype), retval_mode,
+ &retval_unsignedp, funtype, 1);
+
+ if (promoted_mode != retval_mode)
+ {
+ optab call_optab = (retval_unsignedp
+ ? call_value_pop_zext_optab
+ : call_value_pop_sext_optab);
+ pop_icode = optab_handler (call_optab, retval_mode)->insn_code;
+ call_optab = (retval_unsignedp
+ ? call_value_zext_optab
+ : call_value_sext_optab);
+ icode = optab_handler (call_optab, retval_mode)->insn_code;
+ }
+ }
+
#if defined (HAVE_sibcall_pop) && defined (HAVE_sibcall_value_pop)
if ((ecf_flags & ECF_SIBCALL)
&& HAVE_sibcall_pop && HAVE_sibcall_value_pop
@@ -311,7 +339,11 @@ emit_call_1 (rtx funexp, tree fntree ATT
/* If this subroutine pops its own args, record that in the call insn
if possible, for the sake of frame pointer elimination. */
- if (valreg)
+ if (valreg && pop_icode != CODE_FOR_nothing)
+ pat = GEN_FCN (pop_icode) (valreg,
+ gen_rtx_MEM (FUNCTION_MODE, funexp),
+ rounded_stack_size_rtx, next_arg_reg, n_pop);
+ else if (valreg)
pat = GEN_CALL_VALUE_POP (valreg,
gen_rtx_MEM (FUNCTION_MODE, funexp),
rounded_stack_size_rtx, next_arg_reg, n_pop);
@@ -345,7 +377,12 @@ emit_call_1 (rtx funexp, tree fntree ATT
#if defined (HAVE_call) && defined (HAVE_call_value)
if (HAVE_call && HAVE_call_value)
{
- if (valreg)
+ if (valreg && icode != CODE_FOR_nothing)
+ emit_call_insn (GEN_FCN (icode) (valreg,
+ gen_rtx_MEM (FUNCTION_MODE, funexp),
+ rounded_stack_size_rtx, next_arg_reg,
+ NULL_RTX));
+ else if (valreg)
emit_call_insn (GEN_CALL_VALUE (valreg,
gen_rtx_MEM (FUNCTION_MODE, funexp),
rounded_stack_size_rtx, next_arg_reg,
Index: combine.c
===================================================================
--- combine.c (revision 158639)
+++ combine.c (working copy)
@@ -11889,15 +11889,8 @@ record_dead_and_set_regs (rtx insn)
}
last_call_luid = mem_last_set = DF_INSN_LUID (insn);
-
- /* We can't combine into a call pattern. Remember, though, that
- the return value register is set at this LUID. We could
- still replace a register with the return value from the
- wrong subroutine call! */
- note_stores (PATTERN (insn), record_dead_and_set_regs_1, NULL_RTX);
}
- else
- note_stores (PATTERN (insn), record_dead_and_set_regs_1, insn);
+ note_stores (PATTERN (insn), record_dead_and_set_regs_1, insn);
}
/* If a SUBREG has the promoted bit set, it is in fact a property of the
Index: Makefile.in
===================================================================
--- Makefile.in (revision 158639)
+++ Makefile.in (working copy)
@@ -2807,7 +2807,7 @@ builtins.o : builtins.c $(CONFIG_H) $(SY
libfuncs.h $(REAL_H) langhooks.h $(BASIC_BLOCK_H) tree-mudflap.h \
$(BUILTINS_DEF) $(MACHMODE_H) $(DIAGNOSTIC_H) $(TREE_FLOW_H) value-prof.h
calls.o : calls.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(RTL_H) \
- $(TREE_H) $(FLAGS_H) $(EXPR_H) $(OPTABS_H) langhooks.h $(TARGET_H) \
+ $(TREE_H) $(FLAGS_H) $(EXPR_H) $(OPTABS_H) $(RECOG_H) langhooks.h $(TARGET_H) \
libfuncs.h $(REGS_H) $(TOPLEV_H) output.h $(FUNCTION_H) $(TIMEVAR_H) $(TM_P_H) \
$(CGRAPH_H) $(EXCEPT_H) sbitmap.h $(DBGCNT_H) $(TREE_FLOW_H)
expmed.o : expmed.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(RTL_H) $(TREE_H) \
Index: config/arm/arm.md
===================================================================
--- config/arm/arm.md (revision 158639)
+++ config/arm/arm.md (working copy)
@@ -8561,6 +8561,51 @@ (define_expand "call_value"
}"
)
+(define_code_iterator cext [sign_extend zero_extend])
+(define_code_attr optab [(sign_extend "sext")
+ (zero_extend "zext")])
+
+(define_mode_iterator CEXT [QI HI])
+
+(define_expand "call_value_<optab><mode>"
+ [(parallel [(set (match_operand:SI 0 "" "")
+ (cext:SI
+ (call:CEXT (match_operand 1 "memory_operand" "")
+ (match_operand 2 "general_operand" ""))))
+ (use (match_operand 3 "" ""))
+ (clobber (reg:SI LR_REGNUM))])]
+ "TARGET_EITHER"
+ "
+ {
+ rtx pat, callee;
+
+ /* In an untyped call, we can get NULL for operand 2. */
+ if (operands[3] == 0)
+ operands[3] = const0_rtx;
+
+ /* Decide if we should generate indirect calls by loading the
+ 32-bit address of the callee into a register before performing the
+ branch and link. */
+ callee = XEXP (operands[1], 0);
+ if (GET_CODE (callee) == SYMBOL_REF
+ ? arm_is_long_call_p (SYMBOL_REF_DECL (callee))
+ : !REG_P (callee))
+ XEXP (operands[1], 0) = force_reg (Pmode, callee);
+
+ pat = gen_call_value_<optab><mode>_internal (operands[0], operands[1],
+ operands[2], operands[3]);
+ arm_emit_call_insn (pat, XEXP (operands[1], 0));
+ DONE;
+ }"
+)
+
+(define_expand "call_value_<optab><mode>_internal"
+ [(parallel [(set (match_operand 0 "" "")
+ (cext:SI (call:CEXT (match_operand 1 "memory_operand" "")
+ (match_operand 2 "general_operand" ""))))
+ (use (match_operand 3 "" ""))
+ (clobber (reg:SI LR_REGNUM))])])
+
(define_expand "call_value_internal"
[(parallel [(set (match_operand 0 "" "")
(call (match_operand 1 "memory_operand" "")
@@ -8642,6 +8687,80 @@ (define_insn "*call_value_reg_thumb1"
[(set_attr "type" "call")]
)
+(define_insn "*call_value_<optab><mode>_reg_armv5"
+ [(set (match_operand:SI 0 "" "")
+ (cext:SI (call:CEXT (mem:SI (match_operand:SI 1 "s_register_operand" "r"))
+ (match_operand 2 "" ""))))
+ (use (match_operand 3 "" ""))
+ (clobber (reg:SI LR_REGNUM))]
+ "TARGET_ARM && arm_arch5"
+ "blx%?\\t%1"
+ [(set_attr "type" "call")]
+)
+
+(define_insn "*call_value_<optab><mode>_reg_arm"
+ [(set (match_operand:SI 0 "" "")
+ (cext:SI (call:CEXT (mem:SI (match_operand:SI 1 "s_register_operand" "r"))
+ (match_operand 2 "" ""))))
+ (use (match_operand 3 "" ""))
+ (clobber (reg:SI LR_REGNUM))]
+ "TARGET_ARM && !arm_arch5"
+ "*
+ return output_call (&operands[1]);
+ "
+ [(set_attr "length" "12")
+ (set_attr "type" "call")]
+)
+
+;; Note: see *call_mem
+
+(define_insn "*call_value_<optab><mode>_mem"
+ [(set (match_operand:SI 0 "" "")
+ (cext:SI (call:CEXT (mem:SI (match_operand:SI 1 "call_memory_operand" "m"))
+ (match_operand 2 "" ""))))
+ (use (match_operand 3 "" ""))
+ (clobber (reg:SI LR_REGNUM))]
+ "TARGET_ARM && !arm_arch5 && (!CONSTANT_ADDRESS_P (XEXP (operands[1], 0)))"
+ "*
+ return output_call_mem (&operands[1]);
+ "
+ [(set_attr "length" "12")
+ (set_attr "type" "call")]
+)
+
+(define_insn "*call_value_<optab><mode>_reg_thumb1_v5"
+ [(set (match_operand:SI 0 "" "")
+ (cext:SI (call:CEXT (mem:SI (match_operand:SI 1 "register_operand" "l*r"))
+ (match_operand 2 "" ""))))
+ (use (match_operand 3 "" ""))
+ (clobber (reg:SI LR_REGNUM))]
+ "TARGET_THUMB1 && arm_arch5"
+ "blx\\t%1"
+ [(set_attr "length" "2")
+ (set_attr "type" "call")]
+)
+
+(define_insn "*call_value_<optab><mode>_reg_thumb1"
+ [(set (match_operand:SI 0 "" "")
+ (cext:SI (call:CEXT (mem:SI (match_operand:SI 1 "register_operand" "l*r"))
+ (match_operand 2 "" ""))))
+ (use (match_operand 3 "" ""))
+ (clobber (reg:SI LR_REGNUM))]
+ "TARGET_THUMB1 && !arm_arch5"
+ "*
+ {
+ if (!TARGET_CALLER_INTERWORKING)
+ return thumb_call_via_reg (operands[1]);
+ else if (operands[2] == const0_rtx)
+ return \"bl\\t%__interwork_call_via_%1\";
+ else if (frame_pointer_needed)
+ return \"bl\\t%__interwork_r7_call_via_%1\";
+ else
+ return \"bl\\t%__interwork_r11_call_via_%1\";
+ }"
+ [(set_attr "type" "call")]
+)
+
;; Allow calls to SYMBOL_REFs specially as they are not valid general addresses
;; The 'a' causes the operand to be treated as an address, i.e. no '#' output.
@@ -8676,6 +8795,22 @@ (define_insn "*call_value_symbol"
[(set_attr "type" "call")]
)
+(define_insn "*call_value_<optab><mode>_symbol"
+ [(set (match_operand:SI 0 "" "")
+ (cext:SI (call:CEXT (mem:SI (match_operand:SI 1 "" ""))
+ (match_operand:SI 2 "" ""))))
+ (use (match_operand 3 "" ""))
+ (clobber (reg:SI LR_REGNUM))]
+ "TARGET_ARM
+ && (GET_CODE (operands[1]) == SYMBOL_REF)
+ && !arm_is_long_call_p (SYMBOL_REF_DECL (operands[1]))"
+ "*
+ {
+ return NEED_PLT_RELOC ? \"bl%?\\t%a1(PLT)\" : \"bl%?\\t%a1\";
+ }"
+ [(set_attr "type" "call")]
+)
+
(define_insn "*call_insn"
[(call (mem:SI (match_operand:SI 0 "" ""))
(match_operand:SI 1 "" ""))
@@ -8703,6 +8838,20 @@ (define_insn "*call_value_insn"
(set_attr "type" "call")]
)
+(define_insn "*call_value_<optab><mode>_insn"
+ [(set (match_operand:SI 0 "" "")
+ (cext:SI (call:CEXT (mem:SI (match_operand 1 "" ""))
+ (match_operand 2 "" ""))))
+ (use (match_operand 3 "" ""))
+ (clobber (reg:SI LR_REGNUM))]
+ "TARGET_THUMB
+ && GET_CODE (operands[1]) == SYMBOL_REF
+ && !arm_is_long_call_p (SYMBOL_REF_DECL (operands[1]))"
+ "bl\\t%a1"
+ [(set_attr "length" "4")
+ (set_attr "type" "call")]
+)
+
;; We may also be able to do sibcalls for Thumb, but it's much harder...
(define_expand "sibcall"
[(parallel [(call (match_operand 0 "memory_operand" "")
Index: config/arm/thumb2.md
===================================================================
--- config/arm/thumb2.md (revision 158639)
+++ config/arm/thumb2.md (working copy)
@@ -448,6 +448,17 @@ (define_insn "*call_value_reg_thumb2"
[(set_attr "type" "call")]
)
+(define_insn "*call_value_<optab><mode>_reg_thumb2"
+ [(set (match_operand:SI 0 "" "")
+ (cext:SI (call:CEXT (mem:SI (match_operand:SI 1 "register_operand" "l*r"))
+ (match_operand 2 "" ""))))
+ (use (match_operand 3 "" ""))
+ (clobber (reg:SI LR_REGNUM))]
+ "TARGET_THUMB2"
+ "blx\\t%1"
+ [(set_attr "type" "call")]
+)
+
(define_insn "*thumb2_indirect_jump"
[(set (pc)
(match_operand:SI 0 "register_operand" "l*r"))]
Index: testsuite/gcc.target/arm/pr40900.c
===================================================================
--- testsuite/gcc.target/arm/pr40900.c (revision 0)
+++ testsuite/gcc.target/arm/pr40900.c (revision 0)
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fno-optimize-sibling-calls" } */
+
+extern short shortv2();
+short shortv1()
+{
+ return shortv2();
+}
+
+/* { dg-final { scan-assembler-not "lsl" } } */
+/* { dg-final { scan-assembler-not "asr" } } */
+/* { dg-final { scan-assembler-not "sxth" } } */
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Patch: PR40900, extending call patterns
2010-04-30 1:05 Patch: PR40900, extending call patterns Bernd Schmidt
@ 2010-04-30 3:33 ` Andrew Pinski
2010-04-30 9:23 ` Paolo Bonzini
0 siblings, 1 reply; 15+ messages in thread
From: Andrew Pinski @ 2010-04-30 3:33 UTC (permalink / raw)
To: Bernd Schmidt; +Cc: GCC Patches
On Thu, Apr 29, 2010 at 5:32 PM, Bernd Schmidt <bernds@codesourcery.com> wrote:
> The idea is to represent the sign-extension in the RTL call patterns,
> choosing extending variants in calls.c when we notice a suitable
> promotion. Then we let combine do its work; this needed another little
> tweak to make sure it recorded the extension of the return register.
On PowerPC (and MIPS even with 4.3.3), this is already optimized
without a sign extend.
(call_insn 5 4 6 3 t.c:4 (parallel [
(set (reg:SI 3 3)
(call (mem:SI (symbol_ref:SI ("shortv2") [flags 0x41]
<function_decl 0x7fc0bfb81a00 shortv2>) [0 S4 A8])
(const_int 0 [0x0])))
(use (const_int 0 [0x0]))
(clobber (reg:SI 65 lr))
]) -1 (nil)
(nil))
(insn 6 5 7 3 t.c:4 (set (reg:SI 121)
(reg:SI 3 3)) -1 (nil))
(insn 7 6 8 3 t.c:4 (set (reg:SI 119 [ D.2012+-2 ])
(reg:SI 121)) -1 (nil))
(insn 8 7 9 3 t.c:4 (set (reg:SI 120 [ <retval>+-2 ])
(reg:SI 119 [ D.2012+-2 ])) -1 (nil))
--- CUT ---
Maybe I am missing something but the arm RTL should look something
similar without a sign extension.
Thanks,
Andrew Pinski
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Patch: PR40900, extending call patterns
2010-04-30 3:33 ` Andrew Pinski
@ 2010-04-30 9:23 ` Paolo Bonzini
2010-04-30 15:09 ` Bernd Schmidt
0 siblings, 1 reply; 15+ messages in thread
From: Paolo Bonzini @ 2010-04-30 9:23 UTC (permalink / raw)
To: Andrew Pinski; +Cc: Bernd Schmidt, GCC Patches
[-- Attachment #1: Type: text/plain, Size: 1584 bytes --]
On 04/30/2010 03:13 AM, Andrew Pinski wrote:
> On Thu, Apr 29, 2010 at 5:32 PM, Bernd Schmidt<bernds@codesourcery.com> wrote:
>> The idea is to represent the sign-extension in the RTL call patterns,
>> choosing extending variants in calls.c when we notice a suitable
>> promotion. Then we let combine do its work; this needed another little
>> tweak to make sure it recorded the extension of the return register.
>
> On PowerPC (and MIPS even with 4.3.3), this is already optimized
> without a sign extend.
>
> (call_insn 5 4 6 3 t.c:4 (parallel [
> (set (reg:SI 3 3)
> (call (mem:SI (symbol_ref:SI ("shortv2") [flags 0x41]
> <function_decl 0x7fc0bfb81a00 shortv2>) [0 S4 A8])
> (const_int 0 [0x0])))
> (use (const_int 0 [0x0]))
> (clobber (reg:SI 65 lr))
> ]) -1 (nil)
> (nil))
>
> (insn 6 5 7 3 t.c:4 (set (reg:SI 121)
> (reg:SI 3 3)) -1 (nil))
>
> (insn 7 6 8 3 t.c:4 (set (reg:SI 119 [ D.2012+-2 ])
> (reg:SI 121)) -1 (nil))
>
> (insn 8 7 9 3 t.c:4 (set (reg:SI 120 [<retval>+-2 ])
> (reg:SI 119 [ D.2012+-2 ])) -1 (nil))
>
> --- CUT ---
>
> Maybe I am missing something but the arm RTL should look something
> similar without a sign extension.
I suspect that's due to a different definition of promote_prototypes and
other hooks.
However, I agree with Andrew that all the information should already be
there (through the promote_function_mode hook), it's just that combine
isn't trying hard enough. Maybe something based on the attached patch?
Paolo
[-- Attachment #2: pr40900-stub.patch --]
[-- Type: text/plain, Size: 2619 bytes --]
Index: combine.c
===================================================================
--- combine.c (revision 158698)
+++ combine.c (working copy)
@@ -427,7 +427,7 @@ static int recog_for_combine (rtx *, rtx
static rtx gen_lowpart_for_combine (enum machine_mode, rtx);
static enum rtx_code simplify_comparison (enum rtx_code, rtx *, rtx *);
static void update_table_tick (rtx);
-static void record_value_for_reg (rtx, rtx, rtx);
+static reg_stat_type *record_value_for_reg (rtx, rtx, rtx);
static void check_promoted_subreg (rtx, rtx);
static void record_dead_and_set_regs_1 (rtx, const_rtx, void *);
static void record_dead_and_set_regs (rtx);
@@ -11688,7 +11688,7 @@ update_table_tick (rtx x)
only permitted with VALUE also zero and is used to invalidate the
register. */
-static void
+static reg_stat_type *
record_value_for_reg (rtx reg, rtx insn, rtx value)
{
unsigned int regno = REGNO (reg);
@@ -11798,6 +11798,8 @@ record_value_for_reg (rtx reg, rtx insn,
rsp->last_set_sign_bit_copies
= num_sign_bit_copies (value, GET_MODE (reg));
}
+
+ return rsp;
}
/* Called via note_stores from record_dead_and_set_regs to handle one
@@ -11812,10 +11814,20 @@ record_dead_and_set_regs_1 (rtx dest, co
if (GET_CODE (dest) == SUBREG)
dest = SUBREG_REG (dest);
- if (!record_dead_insn)
+ /* We can't combine into a call pattern. Remember that
+ the return value register is set at this LUID. We could
+ still replace a register with the return value from the
+ wrong subroutine call! */
+ if (CALL_P (record_dead_insn))
{
if (REG_P (dest))
- record_value_for_reg (dest, NULL_RTX, NULL_RTX);
+ {
+ reg_stat_type *rsp = record_value_for_reg (dest, NULL_RTX, NULL_RTX);
+
+ /* TODO: set last_set_mode, last_set_nonzero_bits,
+ * last_set_sign_bit_copies depending on the
+ * promote_function_mode hook... */
+ }
return;
}
@@ -11898,15 +11910,9 @@ record_dead_and_set_regs (rtx insn)
}
last_call_luid = mem_last_set = DF_INSN_LUID (insn);
-
- /* We can't combine into a call pattern. Remember, though, that
- the return value register is set at this LUID. We could
- still replace a register with the return value from the
- wrong subroutine call! */
- note_stores (PATTERN (insn), record_dead_and_set_regs_1, NULL_RTX);
}
- else
- note_stores (PATTERN (insn), record_dead_and_set_regs_1, insn);
+
+ note_stores (PATTERN (insn), record_dead_and_set_regs_1, insn);
}
/* If a SUBREG has the promoted bit set, it is in fact a property of the
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Patch: PR40900, extending call patterns
2010-04-30 9:23 ` Paolo Bonzini
@ 2010-04-30 15:09 ` Bernd Schmidt
[not found] ` <201005010054.17350.ebotcazou@adacore.com>
0 siblings, 1 reply; 15+ messages in thread
From: Bernd Schmidt @ 2010-04-30 15:09 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: Andrew Pinski, GCC Patches
[-- Attachment #1: Type: text/plain, Size: 749 bytes --]
On 04/30/2010 10:57 AM, Paolo Bonzini wrote:
> However, I agree with Andrew that all the information should already be
> there (through the promote_function_mode hook), it's just that combine
> isn't trying hard enough. Maybe something based on the attached patch?
When finished, it looks like the one below. Simply setting
sign_bit_copies in record_dead_and_set_regs_1 doesn't work, and rather
than duplicating the bookkeeping I construct a small piece of RTL to
pass to record_value_for_reg that makes it do the right thing.
This works without modifying backends, but it has the drawback that it
can't optimize indirect calls since we're lacking a decl for them.
Testing now in progress on i686-linux. Ok (this or the previous one)?
Bernd
[-- Attachment #2: pr40900-b.diff --]
[-- Type: text/plain, Size: 3625 bytes --]
PR rtl-optimization/40900
* combine.c (record_dead_and_set_regs_1): Handle calls with a MEM
argument by examining the decl and recording an extension if the
return value was promoted.
(record_dead_and_set_regs): Don't handle calls specially.
PR rtl-optimization/40900
* gcc.target/arm/pr40900.c: New test.
Index: combine.c
===================================================================
--- combine.c (revision 158639)
+++ combine.c (working copy)
@@ -11803,20 +11803,49 @@ record_dead_and_set_regs_1 (rtx dest, co
if (GET_CODE (dest) == SUBREG)
dest = SUBREG_REG (dest);
- if (!record_dead_insn)
- {
- if (REG_P (dest))
- record_value_for_reg (dest, NULL_RTX, NULL_RTX);
- return;
- }
-
if (REG_P (dest))
{
/* If we are setting the whole register, we know its value. Otherwise
show that we don't know the value. We can handle SUBREG in
some cases. */
if (GET_CODE (setter) == SET && dest == SET_DEST (setter))
- record_value_for_reg (dest, record_dead_insn, SET_SRC (setter));
+ {
+ if (CALL_P (record_dead_insn) && GET_CODE (SET_SRC (setter)) == CALL
+ && MEM_P (XEXP (SET_SRC (setter), 0))
+ && GET_CODE (XEXP (XEXP (SET_SRC (setter), 0), 0)) == SYMBOL_REF)
+ {
+ /* Try to detect promotions of the return value. */
+ tree decl = SYMBOL_REF_DECL (XEXP (XEXP (SET_SRC (setter), 0), 0));
+ if (decl)
+ {
+ tree fntype = TREE_TYPE (decl);
+ tree result_type = TREE_TYPE (fntype);
+ int uns = TYPE_UNSIGNED (result_type);
+ enum machine_mode mode_from, mode_to;
+ reg_stat_type *rsp;
+ rtx t;
+
+ uns = TYPE_UNSIGNED (result_type);
+ mode_from = TYPE_MODE (result_type);
+ mode_to = promote_function_mode (result_type, mode_from, &uns,
+ fntype, 1);
+
+ if (mode_to == GET_MODE (dest)
+ && GET_MODE_CLASS (mode_to) == MODE_INT
+ && mode_from != mode_to)
+ {
+ rtx t = gen_rtx_fmt_e (uns ? ZERO_EXTEND : SIGN_EXTEND,
+ mode_to,
+ gen_rtx_SCRATCH (mode_from));
+ record_value_for_reg (dest, record_dead_insn, t);
+ return;
+ }
+ }
+ record_value_for_reg (dest, NULL_RTX, NULL_RTX);
+ }
+ else
+ record_value_for_reg (dest, record_dead_insn, SET_SRC (setter));
+ }
else if (GET_CODE (setter) == SET
&& GET_CODE (SET_DEST (setter)) == SUBREG
&& SUBREG_REG (SET_DEST (setter)) == dest
@@ -11889,15 +11918,9 @@ record_dead_and_set_regs (rtx insn)
}
last_call_luid = mem_last_set = DF_INSN_LUID (insn);
-
- /* We can't combine into a call pattern. Remember, though, that
- the return value register is set at this LUID. We could
- still replace a register with the return value from the
- wrong subroutine call! */
- note_stores (PATTERN (insn), record_dead_and_set_regs_1, NULL_RTX);
}
- else
- note_stores (PATTERN (insn), record_dead_and_set_regs_1, insn);
+
+ note_stores (PATTERN (insn), record_dead_and_set_regs_1, insn);
}
/* If a SUBREG has the promoted bit set, it is in fact a property of the
Index: testsuite/gcc.target/arm/pr40900.c
===================================================================
--- testsuite/gcc.target/arm/pr40900.c (revision 0)
+++ testsuite/gcc.target/arm/pr40900.c (revision 0)
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fno-optimize-sibling-calls" } */
+
+extern short shortv2();
+short shortv1()
+{
+ return shortv2();
+}
+
+/* { dg-final { scan-assembler-not "lsl" } } */
+/* { dg-final { scan-assembler-not "asr" } } */
+/* { dg-final { scan-assembler-not "sxth" } } */
^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2010-06-18 16:39 UTC | newest]
Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-04-30 1:05 Patch: PR40900, extending call patterns Bernd Schmidt
2010-04-30 3:33 ` Andrew Pinski
2010-04-30 9:23 ` Paolo Bonzini
2010-04-30 15:09 ` Bernd Schmidt
[not found] ` <201005010054.17350.ebotcazou@adacore.com>
2010-05-01 10:45 ` Paolo Bonzini
2010-05-03 9:39 ` Bernd Schmidt
2010-05-05 13:01 ` Eric Botcazou
2010-05-06 7:14 ` Paolo Bonzini
2010-05-06 7:36 ` Eric Botcazou
2010-06-08 21:56 ` Bernd Schmidt
2010-06-09 22:14 ` Eric Botcazou
2010-06-11 13:58 ` Bernd Schmidt
2010-06-16 21:23 ` Eric Botcazou
2010-06-18 7:31 ` Bernd Schmidt
2010-06-18 17:09 ` Eric Botcazou
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).