* [RFA:] Fix frame-pointer-clobbering in builtins.c:expand_builtin_setjmp_receiver
@ 2012-10-11 23:41 Hans-Peter Nilsson
2012-10-12 8:45 ` Eric Botcazou
0 siblings, 1 reply; 7+ messages in thread
From: Hans-Peter Nilsson @ 2012-10-11 23:41 UTC (permalink / raw)
To: gcc-patches; +Cc: chertykov, aesok, eric.weddington
The md.texi entry for nonlocal_goto_receiver says "A typical reason
why you might need this pattern is if some value, such as a pointer to
a global table, must be restored when the frame pointer is restored.
Note that a nonlocal goto only occurs within a unit-of-translation, so
a global table pointer that is shared by all functions of a given
module need not be restored". One use would be to restore a
hardware-register-value saved in the current frame; the frame where
__builtin_setjmp is called (i.e. not a global context). This is what
the MMIX port does, saving the register-stack-pointer-register for use
when unwinding the register stack. I can imagine other similar
register-restoring needs that require something saved in the current
frame, but current ports with that pattern (or setjmp_receiver) but
without a nonlocal_goto pattern (see the HAVE_nonlocal_goto condition
in the patch) don't. (The AVR port performs what appears to be a
cargo-cult song and dance; the bug below copied into the port, but
the port will not otherwise be affected by the bug or this patch, as it
has a nonlocal_goto pattern.)
But, in the builtins.c:expand_builtin_setjmp_receiver, the
frame-pointer is *clobbered* for a mysterious and fuddy reason:
/* This might change the hard frame pointer in ways that aren't
apparent to early optimization passes, so force a clobber. */
emit_clobber (hard_frame_pointer_rtx);
That comment might have been true eons ago, but these days clobbering
the frame-pointer means that its value is void and any restoring insns
emitted before the clobber are deleted *because* of the clobber. For
example, in built-in-setjmp.c at -O2. Before .191r.ud_dce (i.e. in
the .190r.init-regs dump):
(code_label/s 47 115 48 3 3 "" [2 uses])
(note 48 47 49 3 [bb 3] NOTE_INSN_BASIC_BLOCK)
(insn 49 48 168 3 (use (reg/f:DI 253 $253)) /tmp/mmiximp2/gcc/gcc/testsuite/gcc.c-torture/execute/built-in-setjmp.c:21 -1
(nil))
(insn 168 49 51 3 (set (reg/f:DI 253 $253)
(plus:DI (reg/f:DI 253 $253)
(const_int 24 [0x18]))) /tmp/mmiximp2/gcc/gcc/testsuite/gcc.c-torture/execute/built-in-setjmp.c:21 -1
(nil))
(insn 51 168 52 3 (clobber (reg/f:DI 253 $253)) /tmp/mmiximp2/gcc/gcc/testsuite/gcc.c-torture/execute/built-in-setjmp.c:21 -1
(nil))
(insn 52 51 54 3 (parallel [
(unspec_volatile [
(reg/f:DI 253 $253)
] 1)
(clobber (scratch:DI))
(clobber (reg:DI 259 rJ))
]) /tmp/mmiximp2/gcc/gcc/testsuite/gcc.c-torture/execute/built-in-setjmp.c:21 63 {*nonlocal_goto_receiver_expanded}
(expr_list:REG_UNUSED (reg:DI 259 rJ)
(nil)))
(insn 54 52 136 3 (asm_input/v ("") (null):0) /tmp/mmiximp2/gcc/gcc/testsuite/gcc.c-torture/execute/built-in-setjmp.c:21 -1
(nil))
After:
(code_label/s 47 115 48 3 3 "" [2 uses])
(note 48 47 49 3 [bb 3] NOTE_INSN_BASIC_BLOCK)
(insn 49 48 51 3 (use (reg/f:DI 253 $253)) /tmp/mmiximp2/gcc/gcc/testsuite/gcc.c-torture/execute/built-in-setjmp.c:21 -1
(nil))
(insn 51 49 52 3 (clobber (reg/f:DI 253 $253)) /tmp/mmiximp2/gcc/gcc/testsuite/gcc.c-torture/execute/built-in-setjmp.c:21 -1
(nil))
(insn 52 51 54 3 (parallel [
(unspec_volatile [
(reg/f:DI 253 $253)
] 1)
(clobber (scratch:DI))
(clobber (reg:DI 259 rJ))
]) /tmp/mmiximp2/gcc/gcc/testsuite/gcc.c-torture/execute/built-in-setjmp.c:21 63 {*nonlocal_goto_receiver_expanded}
(expr_list:REG_UNUSED (reg:DI 259 rJ)
(nil)))
(insn 54 52 136 3 (asm_input/v ("") (null):0) /tmp/mmiximp2/gcc/gcc/testsuite/gcc.c-torture/execute/built-in-setjmp.c:21 -1
(nil))
Note that insn 168 deleted, which seems a logical optimization. The
bug is to emit the clobber, not that the restoring insn is removed.
While grepping around for other emitters of nonlocal_goto_receiver I
noticed that builtins.c:expand_builtin_setjmp_receiver is identical to
stmt.c:expand_nl_goto_receiver save for two things: the frame-pointer
clobbering(!) and that expand_builtin_setjmp_receiver instead prefers to
emit setjmp_receiver. I don't see how the frame-pointer-clobbering
would be needed as part of emitting setjmp_receiver.
I suggest eliminating the bug and one copy of the apparently bug-prone
code. I kept the function in builtins.c for obvious reasons (if not
obvious, consider the name: expand *builtin* setjmp_receiver) with the
setjmp-ness expressed through the label parameter, which is non-NULL
for pre-existing calls. Note also the fixed clobber-comment,
obviously incorrect in the stmt.c almost-copy, and at least on the
wrong line in expand_builtin_setjmp_receiver.
Tested mmix-knuth-mmixware, x86_64-unknown-linux-gnu (for good measure)
and rl78-elf (a SJLJ target; fixed-up with a patch from the maintainer
for the current breakage in PR54882, but without fortran), no regressions.
This fixes the following FAILs for mmix-knuth-mmixware:
Running /tmp/mmiximp2/gcc/gcc/testsuite/gcc.c-torture/execute/execute.exp ...
...
FAIL: gcc.c-torture/execute/built-in-setjmp.c execution, -O2
FAIL: gcc.c-torture/execute/built-in-setjmp.c execution, -O3 -fomit-frame-pointer
FAIL: gcc.c-torture/execute/built-in-setjmp.c execution, -O3 -fomit-frame-pointer -funroll-loops
FAIL: gcc.c-torture/execute/built-in-setjmp.c execution, -O3 -fomit-frame-pointer -funroll-all-loops -finline-function\
s
FAIL: gcc.c-torture/execute/built-in-setjmp.c execution, -O3 -g
FAIL: gcc.c-torture/execute/built-in-setjmp.c execution, -Os
FAIL: gcc.c-torture/execute/built-in-setjmp.c execution, -O2 -flto -fno-use-linker-plugin -flto-partition=none
FAIL: gcc.c-torture/execute/built-in-setjmp.c execution, -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects
...
Running /tmp/mmiximp2/gcc/gcc/testsuite/gcc.dg/torture/stackalign/stackalign.exp ...
...
FAIL: gcc.dg/torture/stackalign/setjmp-1.c -O2 execution test
FAIL: gcc.dg/torture/stackalign/setjmp-1.c -O3 -fomit-frame-pointer execution test
FAIL: gcc.dg/torture/stackalign/setjmp-1.c -O3 -fomit-frame-pointer -funroll-loops execution test
FAIL: gcc.dg/torture/stackalign/setjmp-1.c -O3 -fomit-frame-pointer -funroll-all-loops -finline-functions execution t\
est
FAIL: gcc.dg/torture/stackalign/setjmp-1.c -O3 -g execution test
FAIL: gcc.dg/torture/stackalign/setjmp-1.c -Os execution test
FAIL: gcc.dg/torture/stackalign/setjmp-1.c -O2 -flto -fno-use-linker-plugin -flto-partition=none execution test
FAIL: gcc.dg/torture/stackalign/setjmp-1.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects execution test
FAIL: gcc.dg/torture/stackalign/setjmp-3.c -O2 execution test
FAIL: gcc.dg/torture/stackalign/setjmp-3.c -O3 -fomit-frame-pointer execution test
FAIL: gcc.dg/torture/stackalign/setjmp-3.c -O3 -fomit-frame-pointer -funroll-loops execution test
FAIL: gcc.dg/torture/stackalign/setjmp-3.c -O3 -fomit-frame-pointer -funroll-all-loops -finline-functions execution t\
est
FAIL: gcc.dg/torture/stackalign/setjmp-3.c -O3 -g execution test
FAIL: gcc.dg/torture/stackalign/setjmp-3.c -Os execution test
FAIL: gcc.dg/torture/stackalign/setjmp-3.c -O2 -flto -fno-use-linker-plugin -flto-partition=none execution test
FAIL: gcc.dg/torture/stackalign/setjmp-3.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects execution test
FAIL: gcc.dg/torture/stackalign/setjmp-4.c -O2 execution test
FAIL: gcc.dg/torture/stackalign/setjmp-4.c -O3 -fomit-frame-pointer execution test
FAIL: gcc.dg/torture/stackalign/setjmp-4.c -O3 -fomit-frame-pointer -funroll-loops execution test
FAIL: gcc.dg/torture/stackalign/setjmp-4.c -O3 -fomit-frame-pointer -funroll-all-loops -finline-functions execution t\
est
FAIL: gcc.dg/torture/stackalign/setjmp-4.c -O3 -g execution test
FAIL: gcc.dg/torture/stackalign/setjmp-4.c -Os execution test
FAIL: gcc.dg/torture/stackalign/setjmp-4.c -O2 -flto -fno-use-linker-plugin -flto-partition=none execution test
FAIL: gcc.dg/torture/stackalign/setjmp-4.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects execution test
Ok to commit?
gcc:
* stmt.c (expand_nl_goto_receiver): Remove almost-copy of
expand_builtin_setjmp_receiver.
(expand_label): Adjust, call expand_builtin_setjmp_receiver
with NULL for the label parameter.
* builtins.c (expand_builtin_setjmp_receiver): Don't clobber
the frame-pointer. Adjust comments.
[HAVE_builtin_setjmp_receiver]: Emit builtin_setjmp_receiver
only if LABEL is non-NULL.
Index: gcc/stmt.c
===================================================================
--- gcc/stmt.c (revision 192353)
+++ gcc/stmt.c (working copy)
@@ -102,7 +102,6 @@ typedef struct case_node *case_node_ptr;
static int n_occurrences (int, const char *);
static bool tree_conflicts_with_clobbers_p (tree, HARD_REG_SET *);
-static void expand_nl_goto_receiver (void);
static bool check_operand_nalternatives (tree, tree);
static bool check_unique_operand_names (tree, tree, tree);
static char *resolve_operand_name_1 (char *, tree, tree, tree);
@@ -196,7 +195,7 @@ expand_label (tree label)
if (DECL_NONLOCAL (label))
{
- expand_nl_goto_receiver ();
+ expand_builtin_setjmp_receiver (NULL);
nonlocal_goto_handler_labels
= gen_rtx_EXPR_LIST (VOIDmode, label_r,
nonlocal_goto_handler_labels);
@@ -1552,77 +1551,6 @@ expand_return (tree retval)
}
}
-/* Emit code to restore vital registers at the beginning of a nonlocal goto
- handler. */
-static void
-expand_nl_goto_receiver (void)
-{
- rtx chain;
-
- /* Clobber the FP when we get here, so we have to make sure it's
- marked as used by this function. */
- emit_use (hard_frame_pointer_rtx);
-
- /* Mark the static chain as clobbered here so life information
- doesn't get messed up for it. */
- chain = targetm.calls.static_chain (current_function_decl, true);
- if (chain && REG_P (chain))
- emit_clobber (chain);
-
-#ifdef HAVE_nonlocal_goto
- if (! HAVE_nonlocal_goto)
-#endif
- /* First adjust our frame pointer to its actual value. It was
- previously set to the start of the virtual area corresponding to
- the stacked variables when we branched here and now needs to be
- adjusted to the actual hardware fp value.
-
- Assignments are to virtual registers are converted by
- instantiate_virtual_regs into the corresponding assignment
- to the underlying register (fp in this case) that makes
- the original assignment true.
- So the following insn will actually be
- decrementing fp by STARTING_FRAME_OFFSET. */
- emit_move_insn (virtual_stack_vars_rtx, hard_frame_pointer_rtx);
-
-#if !HARD_FRAME_POINTER_IS_ARG_POINTER
- if (fixed_regs[ARG_POINTER_REGNUM])
- {
-#ifdef ELIMINABLE_REGS
- /* If the argument pointer can be eliminated in favor of the
- frame pointer, we don't need to restore it. We assume here
- that if such an elimination is present, it can always be used.
- This is the case on all known machines; if we don't make this
- assumption, we do unnecessary saving on many machines. */
- static const struct elims {const int from, to;} elim_regs[] = ELIMINABLE_REGS;
- size_t i;
-
- for (i = 0; i < ARRAY_SIZE (elim_regs); i++)
- if (elim_regs[i].from == ARG_POINTER_REGNUM
- && elim_regs[i].to == HARD_FRAME_POINTER_REGNUM)
- break;
-
- if (i == ARRAY_SIZE (elim_regs))
-#endif
- {
- /* Now restore our arg pointer from the address at which it
- was saved in our stack frame. */
- emit_move_insn (crtl->args.internal_arg_pointer,
- copy_to_reg (get_arg_pointer_save_area ()));
- }
- }
-#endif
-
-#ifdef HAVE_nonlocal_goto_receiver
- if (HAVE_nonlocal_goto_receiver)
- emit_insn (gen_nonlocal_goto_receiver ());
-#endif
-
- /* We must not allow the code we just generated to be reordered by
- scheduling. Specifically, the update of the frame pointer must
- happen immediately, not later. */
- emit_insn (gen_blockage ());
-}
/* Emit code to save the current value of stack. */
rtx
Index: gcc/builtins.c
===================================================================
--- gcc/builtins.c (revision 192353)
+++ gcc/builtins.c (working copy)
@@ -885,14 +885,16 @@ expand_builtin_setjmp_setup (rtx buf_add
}
/* Construct the trailing part of a __builtin_setjmp call. This is
- also called directly by the SJLJ exception handling code. */
+ also called directly by the SJLJ exception handling code.
+ If RECEIVER_LABEL is NULL, instead the port-specific parts of a
+ nonlocal goto handler are emitted. */
void
expand_builtin_setjmp_receiver (rtx receiver_label ATTRIBUTE_UNUSED)
{
rtx chain;
- /* Clobber the FP when we get here, so we have to make sure it's
+ /* Mark the FP as used when we get here, so we have to make sure it's
marked as used by this function. */
emit_use (hard_frame_pointer_rtx);
@@ -907,12 +909,7 @@ expand_builtin_setjmp_receiver (rtx rece
#ifdef HAVE_nonlocal_goto
if (! HAVE_nonlocal_goto)
#endif
- {
- emit_move_insn (virtual_stack_vars_rtx, hard_frame_pointer_rtx);
- /* This might change the hard frame pointer in ways that aren't
- apparent to early optimization passes, so force a clobber. */
- emit_clobber (hard_frame_pointer_rtx);
- }
+ emit_move_insn (virtual_stack_vars_rtx, hard_frame_pointer_rtx);
#if !HARD_FRAME_POINTER_IS_ARG_POINTER
if (fixed_regs[ARG_POINTER_REGNUM])
@@ -938,7 +935,7 @@ expand_builtin_setjmp_receiver (rtx rece
#endif
#ifdef HAVE_builtin_setjmp_receiver
- if (HAVE_builtin_setjmp_receiver)
+ if (receiver_label != NULL && HAVE_builtin_setjmp_receiver)
emit_insn (gen_builtin_setjmp_receiver (receiver_label));
else
#endif
Oh, I almost forgot, there's a port-bug as well; the MMIX
nonlocal_goto_receiver need, in the pattern, naturally has to *refer*
to the frame-pointer for that to stay around. In built-in-setjmp.c
there's otherwise no other need: after __builtin_setjmp returns
non-zero (well, always 1), the function exits only through calls to
noreturn functions and there's no other frame-local data references
after the __builtin_setjmp call (the variable "p" is in a register).
I'll commit this together with the patch above if approved. Without
that, it has no effect on test-results. RTL dumps above are obviously
with this patch applied. :)
* config/mmix/mmix.md ("nonlocal_goto_receiver"): Refer to the
frame-pointer as an operand.
("*nonlocal_goto_receiver_expanded"): Ditto. Use
mmix_output_register_setting instead of naked output_asm_insn for
the offset from the frame-pointer to the saved rO.
* config/mmix/mmix.c (mmix_output_register_setting): Emit NEGU for
values -255..0.
* config/mmix/predicates.md ("frame_pointer_operand"): New.
* config/mmix/constraints.md ("Yf"): New.
--- gcc/config/mmix/mmix.md.orig 2012-09-10 01:41:59.000000000 +0200
+++ gcc/config/mmix/mmix.md 2012-10-09 01:54:43.000000000 +0200
@@ -1120,7 +1120,7 @@ DIVU %1,%1,%2\;GET %0,:rR\;NEGU %2,0,%0\
;; of "pop 0,0" until rO equals the saved value. (If it goes lower, we
;; should die with a trap.)
(define_expand "nonlocal_goto_receiver"
- [(parallel [(unspec_volatile [(const_int 0)] 1)
+ [(parallel [(unspec_volatile [(match_dup 1)] 1)
(clobber (scratch:DI))
(clobber (reg:DI MMIX_rJ_REGNUM))])
(set (reg:DI MMIX_rJ_REGNUM) (match_dup 0))]
@@ -1131,6 +1131,13 @@ DIVU %1,%1,%2\;GET %0,:rR\;NEGU %2,0,%0\
= mmix_get_hard_reg_initial_val (Pmode,
MMIX_INCOMING_RETURN_ADDRESS_REGNUM);
+ /* We need the frame-pointer to be live or the equivalent
+ expression, so refer to in in the pattern. We can't use a MEM
+ (that may contain out-of-range offsets in the final expression)
+ for fear that middle-end will legitimize it or replace the address
+ using temporary registers (which are not revived at this point). */
+ operands[1] = frame_pointer_rtx;
+
/* Mark this function as containing a landing-pad. */
cfun->machine->has_landing_pad = 1;
}")
@@ -1140,45 +1147,40 @@ DIVU %1,%1,%2\;GET %0,:rR\;NEGU %2,0,%0\
;; address and re-use them after the register stack unwind, so it's best
;; to form the address ourselves.
(define_insn "*nonlocal_goto_receiver_expanded"
- [(unspec_volatile [(const_int 0)] 1)
+ [(unspec_volatile [(match_operand:DI 1 "frame_pointer_operand" "Yf")] 1)
(clobber (match_scratch:DI 0 "=&r"))
(clobber (reg:DI MMIX_rJ_REGNUM))]
""
{
- rtx temp_reg = operands[0];
- rtx my_operands[2];
- HOST_WIDEST_INT offs;
+ rtx my_operands[3];
const char *my_template
= "GETA $255,0f\;PUT rJ,$255\;LDOU $255,%a0\n\
0:\;GET %1,rO\;CMPU %1,%1,$255\;BNP %1,1f\;POP 0,0\n1:";
- my_operands[1] = temp_reg;
+ my_operands[1] = operands[0];
+ my_operands[2] = GEN_INT (-MMIX_fp_rO_OFFSET);
- /* If we have a frame-pointer (hence unknown stack-pointer offset),
- just use the frame-pointer and the known offset. */
- if (frame_pointer_needed)
+ if (operands[1] == hard_frame_pointer_rtx)
{
- my_operands[0] = GEN_INT (-MMIX_fp_rO_OFFSET);
-
- output_asm_insn ("NEGU %1,0,%0", my_operands);
- my_operands[0] = gen_rtx_PLUS (Pmode, frame_pointer_rtx, temp_reg);
+ mmix_output_register_setting (asm_out_file, REGNO (operands[0]),
+ MMIX_fp_rO_OFFSET, 1);
+ my_operands[0]
+ = gen_rtx_PLUS (Pmode, hard_frame_pointer_rtx, operands[0]);
}
else
{
- /* We know the fp-based offset, so "eliminate" it to be sp-based. */
- offs
- = (mmix_initial_elimination_offset (MMIX_FRAME_POINTER_REGNUM,
- MMIX_STACK_POINTER_REGNUM)
- + MMIX_fp_rO_OFFSET);
+ HOST_WIDEST_INT offs = INTVAL (XEXP (operands[1], 1));
+ offs += MMIX_fp_rO_OFFSET;
- if (offs >= 0 && offs <= 255)
+ if (insn_const_int_ok_for_constraint (offs, CONSTRAINT_I))
my_operands[0]
= gen_rtx_PLUS (Pmode, stack_pointer_rtx, GEN_INT (offs));
else
{
- mmix_output_register_setting (asm_out_file, REGNO (temp_reg),
+ mmix_output_register_setting (asm_out_file, REGNO (operands[0]),
offs, 1);
- my_operands[0] = gen_rtx_PLUS (Pmode, stack_pointer_rtx, temp_reg);
+ my_operands[0]
+ = gen_rtx_PLUS (Pmode, stack_pointer_rtx, operands[0]);
}
}
--- gcc/config/mmix/mmix.c.orig2 2012-10-07 05:11:52.000000000 +0200
+++ gcc/config/mmix/mmix.c 2012-10-09 01:56:17.000000000 +0200
@@ -2281,7 +2281,9 @@ mmix_output_register_setting (FILE *stre
if (do_begin_end)
fprintf (stream, "\t");
- if (mmix_shiftable_wyde_value ((unsigned HOST_WIDEST_INT) value))
+ if (insn_const_int_ok_for_constraint (value, CONSTRAINT_K))
+ fprintf (stream, "NEGU %s,0," HOST_WIDEST_INT_PRINT_DEC, reg_names[regno], -value);
+ else if (mmix_shiftable_wyde_value ((unsigned HOST_WIDEST_INT) value))
{
/* First, the one-insn cases. */
mmix_output_shiftvalue_op_from_str (stream, "SET",
--- gcc/config/mmix/predicates.md.orig 2012-09-10 01:41:59.000000000 +0200
+++ gcc/config/mmix/predicates.md 2012-10-08 20:10:08.000000000 +0200
@@ -161,3 +161,14 @@
(if_then_else (match_test "reload_in_progress || reload_completed")
(match_test "strict_memory_address_p (Pmode, op)")
(match_test "memory_address_p (Pmode, op)")))
+
+(define_predicate "frame_pointer_operand"
+ (ior
+ (and
+ (match_code "reg")
+ (match_test "op == hard_frame_pointer_rtx || op == frame_pointer_rtx"))
+ (and
+ (match_code "plus")
+ (match_code "reg" "0")
+ (match_code "const_int" "1")
+ (match_test "XEXP (op, 0) == stack_pointer_rtx"))))
--- gcc/config/mmix/constraints.md.orig 2012-09-10 01:41:59.000000000 +0200
+++ gcc/config/mmix/constraints.md 2012-10-08 09:07:48.000000000 +0200
@@ -110,3 +110,7 @@
(define_address_constraint "U"
"@internal"
(match_operand 0 "mmix_address_operand"))
+
+(define_constraint "Yf"
+ "@internal"
+ (match_operand 0 "frame_pointer_operand"))
brgds, H-P
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [RFA:] Fix frame-pointer-clobbering in builtins.c:expand_builtin_setjmp_receiver
2012-10-11 23:41 [RFA:] Fix frame-pointer-clobbering in builtins.c:expand_builtin_setjmp_receiver Hans-Peter Nilsson
@ 2012-10-12 8:45 ` Eric Botcazou
2012-10-16 0:47 ` Hans-Peter Nilsson
0 siblings, 1 reply; 7+ messages in thread
From: Eric Botcazou @ 2012-10-12 8:45 UTC (permalink / raw)
To: Hans-Peter Nilsson; +Cc: gcc-patches, chertykov, aesok, eric.weddington
> But, in the builtins.c:expand_builtin_setjmp_receiver, the
> frame-pointer is *clobbered* for a mysterious and fuddy reason:
>
> /* This might change the hard frame pointer in ways that aren't
> apparent to early optimization passes, so force a clobber. */
> emit_clobber (hard_frame_pointer_rtx);
>
> That comment might have been true eons ago, but these days clobbering
> the frame-pointer means that its value is void and any restoring insns
> emitted before the clobber are deleted *because* of the clobber. For
> example, in built-in-setjmp.c at -O2. Before .191r.ud_dce (i.e. in
> the .190r.init-regs dump):
>
> (code_label/s 47 115 48 3 3 "" [2 uses])
> (note 48 47 49 3 [bb 3] NOTE_INSN_BASIC_BLOCK)
> (insn 49 48 168 3 (use (reg/f:DI 253 $253))
> /tmp/mmiximp2/gcc/gcc/testsuite/gcc.c-torture/execute/built-in-setjmp.c:21
> -1 (nil))
> (insn 168 49 51 3 (set (reg/f:DI 253 $253)
> (plus:DI (reg/f:DI 253 $253)
> (const_int 24 [0x18])))
> /tmp/mmiximp2/gcc/gcc/testsuite/gcc.c-torture/execute/built-in-setjmp.c:21
> -1 (nil))
> (insn 51 168 52 3 (clobber (reg/f:DI 253 $253))
> /tmp/mmiximp2/gcc/gcc/testsuite/gcc.c-torture/execute/built-in-setjmp.c:21
> -1 (nil))
> (insn 52 51 54 3 (parallel [
> (unspec_volatile [
> (reg/f:DI 253 $253)
> ] 1)
> (clobber (scratch:DI))
> (clobber (reg:DI 259 rJ))
> ])
> /tmp/mmiximp2/gcc/gcc/testsuite/gcc.c-torture/execute/built-in-setjmp.c:21
> 63 {*nonlocal_goto_receiver_expanded} (expr_list:REG_UNUSED (reg:DI 259 rJ)
> (nil)))
> (insn 54 52 136 3 (asm_input/v ("") (null):0)
> /tmp/mmiximp2/gcc/gcc/testsuite/gcc.c-torture/execute/built-in-setjmp.c:21
> -1 (nil))
>
> After:
>
> (code_label/s 47 115 48 3 3 "" [2 uses])
> (note 48 47 49 3 [bb 3] NOTE_INSN_BASIC_BLOCK)
> (insn 49 48 51 3 (use (reg/f:DI 253 $253))
> /tmp/mmiximp2/gcc/gcc/testsuite/gcc.c-torture/execute/built-in-setjmp.c:21
> -1 (nil))
> (insn 51 49 52 3 (clobber (reg/f:DI 253 $253))
> /tmp/mmiximp2/gcc/gcc/testsuite/gcc.c-torture/execute/built-in-setjmp.c:21
> -1 (nil))
> (insn 52 51 54 3 (parallel [
> (unspec_volatile [
> (reg/f:DI 253 $253)
> ] 1)
> (clobber (scratch:DI))
> (clobber (reg:DI 259 rJ))
> ])
> /tmp/mmiximp2/gcc/gcc/testsuite/gcc.c-torture/execute/built-in-setjmp.c:21
> 63 {*nonlocal_goto_receiver_expanded} (expr_list:REG_UNUSED (reg:DI 259 rJ)
> (nil)))
> (insn 54 52 136 3 (asm_input/v ("") (null):0)
> /tmp/mmiximp2/gcc/gcc/testsuite/gcc.c-torture/execute/built-in-setjmp.c:21
> -1 (nil))
>
> Note that insn 168 deleted, which seems a logical optimization. The
> bug is to emit the clobber, not that the restoring insn is removed.
Had that worked in the past for MMIX? If so, what changed recently?
> While grepping around for other emitters of nonlocal_goto_receiver I
> noticed that builtins.c:expand_builtin_setjmp_receiver is identical to
> stmt.c:expand_nl_goto_receiver save for two things: the frame-pointer
> clobbering(!) and that expand_builtin_setjmp_receiver instead prefers to
> emit setjmp_receiver. I don't see how the frame-pointer-clobbering
> would be needed as part of emitting setjmp_receiver.
Indeed, the discrepancy seems to have no firm basis.
> I suggest eliminating the bug and one copy of the apparently bug-prone
> code. I kept the function in builtins.c for obvious reasons (if not
> obvious, consider the name: expand *builtin* setjmp_receiver) with the
> setjmp-ness expressed through the label parameter, which is non-NULL
> for pre-existing calls. Note also the fixed clobber-comment,
> obviously incorrect in the stmt.c almost-copy, and at least on the
> wrong line in expand_builtin_setjmp_receiver.
Agreed. However, I'd suggest rescuing the comment for the ELIMINABLE_REGS
block from expand_nl_goto_receiver as it still sounds valid to me.
> * stmt.c (expand_nl_goto_receiver): Remove almost-copy of
> expand_builtin_setjmp_receiver.
> (expand_label): Adjust, call expand_builtin_setjmp_receiver
> with NULL for the label parameter.
> * builtins.c (expand_builtin_setjmp_receiver): Don't clobber
> the frame-pointer. Adjust comments.
> [HAVE_builtin_setjmp_receiver]: Emit builtin_setjmp_receiver
> only if LABEL is non-NULL.
I cannot formally approve, but this looks good to me modulo:
> Index: gcc/builtins.c
> ===================================================================
> --- gcc/builtins.c (revision 192353)
> +++ gcc/builtins.c (working copy)
> @@ -885,14 +885,16 @@ expand_builtin_setjmp_setup (rtx buf_add
> }
>
> /* Construct the trailing part of a __builtin_setjmp call. This is
> - also called directly by the SJLJ exception handling code. */
> + also called directly by the SJLJ exception handling code.
> + If RECEIVER_LABEL is NULL, instead the port-specific parts of a
> + nonlocal goto handler are emitted. */
The "port-specific parts" wording is a bit confusing I think. I'd just write:
If RECEIVER_LABEL is NULL, instead contruct a nonlocal goto handler.
--
Eric Botcazou
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [RFA:] Fix frame-pointer-clobbering in builtins.c:expand_builtin_setjmp_receiver
2012-10-12 8:45 ` Eric Botcazou
@ 2012-10-16 0:47 ` Hans-Peter Nilsson
2012-10-21 6:09 ` Ping: " Hans-Peter Nilsson
0 siblings, 1 reply; 7+ messages in thread
From: Hans-Peter Nilsson @ 2012-10-16 0:47 UTC (permalink / raw)
To: Eric Botcazou; +Cc: gcc-patches
On Fri, 12 Oct 2012, Eric Botcazou wrote:
> > (insn 168 49 51 3 (set (reg/f:DI 253 $253)
> > (plus:DI (reg/f:DI 253 $253)
> > (const_int 24 [0x18])))
> > /tmp/mmiximp2/gcc/gcc/testsuite/gcc.c-torture/execute/built-in-setjmp.c:21
> > -1 (nil))
> > (insn 51 168 52 3 (clobber (reg/f:DI 253 $253))
...
> > Note that insn 168 deleted, which seems a logical optimization. The
> > bug is to emit the clobber, not that the restoring insn is removed.
>
> Had that worked in the past for MMIX?
Yes, for svn revision 106027 (20051030) 4.1.0-era (!)
<http://gcc.gnu.org/ml/gcc-testresults/2005-10/msg01340.html>
where the test must have passed, as
gcc.c-torture/execute/built-in-setjmp.c is at least four years
older than that.
> If so, what changed recently?
By "these days" I didn't mean "recent", just not "eons ago". :)
I see in a gcc-test-results posting from Mike Stein (whom I'd
like to thank for test-results posting over the years), matching
FAILs for svn revision 126095 (20070628) 4.3.0-era
<http://gcc.gnu.org/ml/gcc-testresults/2007-06/msg01287.html>.
Sorry, I have nothing in between those reports, my bad. Though
I see no point narrowing down the failing revision further here
IMO; as mentioned the bug is not that the restoring insn is
removed.
> Agreed. However, I'd suggest rescuing the comment for the ELIMINABLE_REGS
> block from expand_nl_goto_receiver as it still sounds valid to me.
Oops, my bad; I see I removed all the good comments. Fixed.
> > * stmt.c (expand_nl_goto_receiver): Remove almost-copy of
> > expand_builtin_setjmp_receiver.
> > (expand_label): Adjust, call expand_builtin_setjmp_receiver
> > with NULL for the label parameter.
> > * builtins.c (expand_builtin_setjmp_receiver): Don't clobber
> > the frame-pointer. Adjust comments.
> > [HAVE_builtin_setjmp_receiver]: Emit builtin_setjmp_receiver
> > only if LABEL is non-NULL.
>
> I cannot formally approve, but this looks good to me modulo:
> > + If RECEIVER_LABEL is NULL, instead the port-specific parts of a
> > + nonlocal goto handler are emitted. */
>
> The "port-specific parts" wording is a bit confusing I think. I'd just write:
>
> If RECEIVER_LABEL is NULL, instead contruct a nonlocal goto handler.
Sure. Thanks for the review. Updated patch below. As nothing
was changed from the previous post but comments as per the
review (mostly moving / reviving, fixing one grammo), already
covered by the changelog quoted above, the previous testing is
still valid.
Ok for trunk, approvers?
Index: gcc/builtins.c
===================================================================
--- gcc/builtins.c (revision 192353)
+++ gcc/builtins.c (working copy)
@@ -885,14 +885,15 @@ expand_builtin_setjmp_setup (rtx buf_add
}
/* Construct the trailing part of a __builtin_setjmp call. This is
- also called directly by the SJLJ exception handling code. */
+ also called directly by the SJLJ exception handling code.
+ If RECEIVER_LABEL is NULL, instead contruct a nonlocal goto handler. */
void
expand_builtin_setjmp_receiver (rtx receiver_label ATTRIBUTE_UNUSED)
{
rtx chain;
- /* Clobber the FP when we get here, so we have to make sure it's
+ /* Mark the FP as used when we get here, so we have to make sure it's
marked as used by this function. */
emit_use (hard_frame_pointer_rtx);
@@ -907,17 +908,28 @@ expand_builtin_setjmp_receiver (rtx rece
#ifdef HAVE_nonlocal_goto
if (! HAVE_nonlocal_goto)
#endif
- {
- emit_move_insn (virtual_stack_vars_rtx, hard_frame_pointer_rtx);
- /* This might change the hard frame pointer in ways that aren't
- apparent to early optimization passes, so force a clobber. */
- emit_clobber (hard_frame_pointer_rtx);
- }
+ /* First adjust our frame pointer to its actual value. It was
+ previously set to the start of the virtual area corresponding to
+ the stacked variables when we branched here and now needs to be
+ adjusted to the actual hardware fp value.
+
+ Assignments to virtual registers are converted by
+ instantiate_virtual_regs into the corresponding assignment
+ to the underlying register (fp in this case) that makes
+ the original assignment true.
+ So the following insn will actually be decrementing fp by
+ STARTING_FRAME_OFFSET. */
+ emit_move_insn (virtual_stack_vars_rtx, hard_frame_pointer_rtx);
#if !HARD_FRAME_POINTER_IS_ARG_POINTER
if (fixed_regs[ARG_POINTER_REGNUM])
{
#ifdef ELIMINABLE_REGS
+ /* If the argument pointer can be eliminated in favor of the
+ frame pointer, we don't need to restore it. We assume here
+ that if such an elimination is present, it can always be used.
+ This is the case on all known machines; if we don't make this
+ assumption, we do unnecessary saving on many machines. */
size_t i;
static const struct elims {const int from, to;} elim_regs[] = ELIMINABLE_REGS;
@@ -938,7 +950,7 @@ expand_builtin_setjmp_receiver (rtx rece
#endif
#ifdef HAVE_builtin_setjmp_receiver
- if (HAVE_builtin_setjmp_receiver)
+ if (receiver_label != NULL && HAVE_builtin_setjmp_receiver)
emit_insn (gen_builtin_setjmp_receiver (receiver_label));
else
#endif
Index: gcc/stmt.c
===================================================================
--- gcc/stmt.c (revision 192353)
+++ gcc/stmt.c (working copy)
@@ -102,7 +102,6 @@ typedef struct case_node *case_node_ptr;
static int n_occurrences (int, const char *);
static bool tree_conflicts_with_clobbers_p (tree, HARD_REG_SET *);
-static void expand_nl_goto_receiver (void);
static bool check_operand_nalternatives (tree, tree);
static bool check_unique_operand_names (tree, tree, tree);
static char *resolve_operand_name_1 (char *, tree, tree, tree);
@@ -196,7 +195,7 @@ expand_label (tree label)
if (DECL_NONLOCAL (label))
{
- expand_nl_goto_receiver ();
+ expand_builtin_setjmp_receiver (NULL);
nonlocal_goto_handler_labels
= gen_rtx_EXPR_LIST (VOIDmode, label_r,
nonlocal_goto_handler_labels);
@@ -1552,77 +1551,6 @@ expand_return (tree retval)
}
}
-/* Emit code to restore vital registers at the beginning of a nonlocal goto
- handler. */
-static void
-expand_nl_goto_receiver (void)
-{
- rtx chain;
-
- /* Clobber the FP when we get here, so we have to make sure it's
- marked as used by this function. */
- emit_use (hard_frame_pointer_rtx);
-
- /* Mark the static chain as clobbered here so life information
- doesn't get messed up for it. */
- chain = targetm.calls.static_chain (current_function_decl, true);
- if (chain && REG_P (chain))
- emit_clobber (chain);
-
-#ifdef HAVE_nonlocal_goto
- if (! HAVE_nonlocal_goto)
-#endif
- /* First adjust our frame pointer to its actual value. It was
- previously set to the start of the virtual area corresponding to
- the stacked variables when we branched here and now needs to be
- adjusted to the actual hardware fp value.
-
- Assignments are to virtual registers are converted by
- instantiate_virtual_regs into the corresponding assignment
- to the underlying register (fp in this case) that makes
- the original assignment true.
- So the following insn will actually be
- decrementing fp by STARTING_FRAME_OFFSET. */
- emit_move_insn (virtual_stack_vars_rtx, hard_frame_pointer_rtx);
-
-#if !HARD_FRAME_POINTER_IS_ARG_POINTER
- if (fixed_regs[ARG_POINTER_REGNUM])
- {
-#ifdef ELIMINABLE_REGS
- /* If the argument pointer can be eliminated in favor of the
- frame pointer, we don't need to restore it. We assume here
- that if such an elimination is present, it can always be used.
- This is the case on all known machines; if we don't make this
- assumption, we do unnecessary saving on many machines. */
- static const struct elims {const int from, to;} elim_regs[] = ELIMINABLE_REGS;
- size_t i;
-
- for (i = 0; i < ARRAY_SIZE (elim_regs); i++)
- if (elim_regs[i].from == ARG_POINTER_REGNUM
- && elim_regs[i].to == HARD_FRAME_POINTER_REGNUM)
- break;
-
- if (i == ARRAY_SIZE (elim_regs))
-#endif
- {
- /* Now restore our arg pointer from the address at which it
- was saved in our stack frame. */
- emit_move_insn (crtl->args.internal_arg_pointer,
- copy_to_reg (get_arg_pointer_save_area ()));
- }
- }
-#endif
-
-#ifdef HAVE_nonlocal_goto_receiver
- if (HAVE_nonlocal_goto_receiver)
- emit_insn (gen_nonlocal_goto_receiver ());
-#endif
-
- /* We must not allow the code we just generated to be reordered by
- scheduling. Specifically, the update of the frame pointer must
- happen immediately, not later. */
- emit_insn (gen_blockage ());
-}
/* Emit code to save the current value of stack. */
rtx
brgds, H-P
^ permalink raw reply [flat|nested] 7+ messages in thread
* Ping: [RFA:] Fix frame-pointer-clobbering in builtins.c:expand_builtin_setjmp_receiver
2012-10-16 0:47 ` Hans-Peter Nilsson
@ 2012-10-21 6:09 ` Hans-Peter Nilsson
2012-10-22 8:00 ` Richard Biener
0 siblings, 1 reply; 7+ messages in thread
From: Hans-Peter Nilsson @ 2012-10-21 6:09 UTC (permalink / raw)
To: gcc-patches
Cc: Jeffrey A Law, Roger Sayle, Ian Lance Taylor, dnovillo, rguenther
CC:ing middle-end maintainers this time. I was a bit surprised
when Eric Botcazou wrote in his review, quoted below, that he's
not one of you. Maybe approve that too?
On Mon, 15 Oct 2012, Hans-Peter Nilsson wrote:
> On Fri, 12 Oct 2012, Eric Botcazou wrote:
> > > (insn 168 49 51 3 (set (reg/f:DI 253 $253)
> > > (plus:DI (reg/f:DI 253 $253)
> > > (const_int 24 [0x18])))
> > > /tmp/mmiximp2/gcc/gcc/testsuite/gcc.c-torture/execute/built-in-setjmp.c:21
> > > -1 (nil))
> > > (insn 51 168 52 3 (clobber (reg/f:DI 253 $253))
> ...
>
> > > Note that insn 168 deleted, which seems a logical optimization. The
> > > bug is to emit the clobber, not that the restoring insn is removed.
> >
> > Had that worked in the past for MMIX?
>
> Yes, for svn revision 106027 (20051030) 4.1.0-era (!)
> <http://gcc.gnu.org/ml/gcc-testresults/2005-10/msg01340.html>
> where the test must have passed, as
> gcc.c-torture/execute/built-in-setjmp.c is at least four years
> older than that.
>
> > If so, what changed recently?
>
> By "these days" I didn't mean "recent", just not "eons ago". :)
> I see in a gcc-test-results posting from Mike Stein (whom I'd
> like to thank for test-results posting over the years), matching
> FAILs for svn revision 126095 (20070628) 4.3.0-era
> <http://gcc.gnu.org/ml/gcc-testresults/2007-06/msg01287.html>.
>
> Sorry, I have nothing in between those reports, my bad. Though
> I see no point narrowing down the failing revision further here
> IMO; as mentioned the bug is not that the restoring insn is
> removed.
>
> > Agreed. However, I'd suggest rescuing the comment for the ELIMINABLE_REGS
> > block from expand_nl_goto_receiver as it still sounds valid to me.
>
> Oops, my bad; I see I removed all the good comments. Fixed.
>
> > > * stmt.c (expand_nl_goto_receiver): Remove almost-copy of
> > > expand_builtin_setjmp_receiver.
> > > (expand_label): Adjust, call expand_builtin_setjmp_receiver
> > > with NULL for the label parameter.
> > > * builtins.c (expand_builtin_setjmp_receiver): Don't clobber
> > > the frame-pointer. Adjust comments.
> > > [HAVE_builtin_setjmp_receiver]: Emit builtin_setjmp_receiver
> > > only if LABEL is non-NULL.
> >
> > I cannot formally approve, but this looks good to me modulo:
>
> > > + If RECEIVER_LABEL is NULL, instead the port-specific parts of a
> > > + nonlocal goto handler are emitted. */
> >
> > The "port-specific parts" wording is a bit confusing I think. I'd just write:
> >
> > If RECEIVER_LABEL is NULL, instead contruct a nonlocal goto handler.
>
> Sure. Thanks for the review. Updated patch below. As nothing
> was changed from the previous post but comments as per the
> review (mostly moving / reviving, fixing one grammo), already
> covered by the changelog quoted above, the previous testing is
> still valid.
>
> Ok for trunk, approvers?
>
> Index: gcc/builtins.c
> ===================================================================
> --- gcc/builtins.c (revision 192353)
> +++ gcc/builtins.c (working copy)
> @@ -885,14 +885,15 @@ expand_builtin_setjmp_setup (rtx buf_add
> }
>
> /* Construct the trailing part of a __builtin_setjmp call. This is
> - also called directly by the SJLJ exception handling code. */
> + also called directly by the SJLJ exception handling code.
> + If RECEIVER_LABEL is NULL, instead contruct a nonlocal goto handler. */
>
> void
> expand_builtin_setjmp_receiver (rtx receiver_label ATTRIBUTE_UNUSED)
> {
> rtx chain;
>
> - /* Clobber the FP when we get here, so we have to make sure it's
> + /* Mark the FP as used when we get here, so we have to make sure it's
> marked as used by this function. */
> emit_use (hard_frame_pointer_rtx);
>
> @@ -907,17 +908,28 @@ expand_builtin_setjmp_receiver (rtx rece
> #ifdef HAVE_nonlocal_goto
> if (! HAVE_nonlocal_goto)
> #endif
> - {
> - emit_move_insn (virtual_stack_vars_rtx, hard_frame_pointer_rtx);
> - /* This might change the hard frame pointer in ways that aren't
> - apparent to early optimization passes, so force a clobber. */
> - emit_clobber (hard_frame_pointer_rtx);
> - }
> + /* First adjust our frame pointer to its actual value. It was
> + previously set to the start of the virtual area corresponding to
> + the stacked variables when we branched here and now needs to be
> + adjusted to the actual hardware fp value.
> +
> + Assignments to virtual registers are converted by
> + instantiate_virtual_regs into the corresponding assignment
> + to the underlying register (fp in this case) that makes
> + the original assignment true.
> + So the following insn will actually be decrementing fp by
> + STARTING_FRAME_OFFSET. */
> + emit_move_insn (virtual_stack_vars_rtx, hard_frame_pointer_rtx);
>
> #if !HARD_FRAME_POINTER_IS_ARG_POINTER
> if (fixed_regs[ARG_POINTER_REGNUM])
> {
> #ifdef ELIMINABLE_REGS
> + /* If the argument pointer can be eliminated in favor of the
> + frame pointer, we don't need to restore it. We assume here
> + that if such an elimination is present, it can always be used.
> + This is the case on all known machines; if we don't make this
> + assumption, we do unnecessary saving on many machines. */
> size_t i;
> static const struct elims {const int from, to;} elim_regs[] = ELIMINABLE_REGS;
>
> @@ -938,7 +950,7 @@ expand_builtin_setjmp_receiver (rtx rece
> #endif
>
> #ifdef HAVE_builtin_setjmp_receiver
> - if (HAVE_builtin_setjmp_receiver)
> + if (receiver_label != NULL && HAVE_builtin_setjmp_receiver)
> emit_insn (gen_builtin_setjmp_receiver (receiver_label));
> else
> #endif
> Index: gcc/stmt.c
> ===================================================================
> --- gcc/stmt.c (revision 192353)
> +++ gcc/stmt.c (working copy)
> @@ -102,7 +102,6 @@ typedef struct case_node *case_node_ptr;
>
> static int n_occurrences (int, const char *);
> static bool tree_conflicts_with_clobbers_p (tree, HARD_REG_SET *);
> -static void expand_nl_goto_receiver (void);
> static bool check_operand_nalternatives (tree, tree);
> static bool check_unique_operand_names (tree, tree, tree);
> static char *resolve_operand_name_1 (char *, tree, tree, tree);
> @@ -196,7 +195,7 @@ expand_label (tree label)
>
> if (DECL_NONLOCAL (label))
> {
> - expand_nl_goto_receiver ();
> + expand_builtin_setjmp_receiver (NULL);
> nonlocal_goto_handler_labels
> = gen_rtx_EXPR_LIST (VOIDmode, label_r,
> nonlocal_goto_handler_labels);
> @@ -1552,77 +1551,6 @@ expand_return (tree retval)
> }
> }
>
> -/* Emit code to restore vital registers at the beginning of a nonlocal goto
> - handler. */
> -static void
> -expand_nl_goto_receiver (void)
> -{
> - rtx chain;
> -
> - /* Clobber the FP when we get here, so we have to make sure it's
> - marked as used by this function. */
> - emit_use (hard_frame_pointer_rtx);
> -
> - /* Mark the static chain as clobbered here so life information
> - doesn't get messed up for it. */
> - chain = targetm.calls.static_chain (current_function_decl, true);
> - if (chain && REG_P (chain))
> - emit_clobber (chain);
> -
> -#ifdef HAVE_nonlocal_goto
> - if (! HAVE_nonlocal_goto)
> -#endif
> - /* First adjust our frame pointer to its actual value. It was
> - previously set to the start of the virtual area corresponding to
> - the stacked variables when we branched here and now needs to be
> - adjusted to the actual hardware fp value.
> -
> - Assignments are to virtual registers are converted by
> - instantiate_virtual_regs into the corresponding assignment
> - to the underlying register (fp in this case) that makes
> - the original assignment true.
> - So the following insn will actually be
> - decrementing fp by STARTING_FRAME_OFFSET. */
> - emit_move_insn (virtual_stack_vars_rtx, hard_frame_pointer_rtx);
> -
> -#if !HARD_FRAME_POINTER_IS_ARG_POINTER
> - if (fixed_regs[ARG_POINTER_REGNUM])
> - {
> -#ifdef ELIMINABLE_REGS
> - /* If the argument pointer can be eliminated in favor of the
> - frame pointer, we don't need to restore it. We assume here
> - that if such an elimination is present, it can always be used.
> - This is the case on all known machines; if we don't make this
> - assumption, we do unnecessary saving on many machines. */
> - static const struct elims {const int from, to;} elim_regs[] = ELIMINABLE_REGS;
> - size_t i;
> -
> - for (i = 0; i < ARRAY_SIZE (elim_regs); i++)
> - if (elim_regs[i].from == ARG_POINTER_REGNUM
> - && elim_regs[i].to == HARD_FRAME_POINTER_REGNUM)
> - break;
> -
> - if (i == ARRAY_SIZE (elim_regs))
> -#endif
> - {
> - /* Now restore our arg pointer from the address at which it
> - was saved in our stack frame. */
> - emit_move_insn (crtl->args.internal_arg_pointer,
> - copy_to_reg (get_arg_pointer_save_area ()));
> - }
> - }
> -#endif
> -
> -#ifdef HAVE_nonlocal_goto_receiver
> - if (HAVE_nonlocal_goto_receiver)
> - emit_insn (gen_nonlocal_goto_receiver ());
> -#endif
> -
> - /* We must not allow the code we just generated to be reordered by
> - scheduling. Specifically, the update of the frame pointer must
> - happen immediately, not later. */
> - emit_insn (gen_blockage ());
> -}
>
> /* Emit code to save the current value of stack. */
> rtx
> brgds, H-P
>
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Ping: [RFA:] Fix frame-pointer-clobbering in builtins.c:expand_builtin_setjmp_receiver
2012-10-21 6:09 ` Ping: " Hans-Peter Nilsson
@ 2012-10-22 8:00 ` Richard Biener
0 siblings, 0 replies; 7+ messages in thread
From: Richard Biener @ 2012-10-22 8:00 UTC (permalink / raw)
To: Hans-Peter Nilsson
Cc: gcc-patches, Jeffrey A Law, Roger Sayle, Ian Lance Taylor, dnovillo
On Sun, 21 Oct 2012, Hans-Peter Nilsson wrote:
> CC:ing middle-end maintainers this time. I was a bit surprised
> when Eric Botcazou wrote in his review, quoted below, that he's
> not one of you. Maybe approve that too?
If Eric is fine with the patch it is ok. Yes, he is not
middle-end maintainer but RTL optimizer reviewer.
Thanks,
Richard.
> On Mon, 15 Oct 2012, Hans-Peter Nilsson wrote:
>
> > On Fri, 12 Oct 2012, Eric Botcazou wrote:
> > > > (insn 168 49 51 3 (set (reg/f:DI 253 $253)
> > > > (plus:DI (reg/f:DI 253 $253)
> > > > (const_int 24 [0x18])))
> > > > /tmp/mmiximp2/gcc/gcc/testsuite/gcc.c-torture/execute/built-in-setjmp.c:21
> > > > -1 (nil))
> > > > (insn 51 168 52 3 (clobber (reg/f:DI 253 $253))
> > ...
> >
> > > > Note that insn 168 deleted, which seems a logical optimization. The
> > > > bug is to emit the clobber, not that the restoring insn is removed.
> > >
> > > Had that worked in the past for MMIX?
> >
> > Yes, for svn revision 106027 (20051030) 4.1.0-era (!)
> > <http://gcc.gnu.org/ml/gcc-testresults/2005-10/msg01340.html>
> > where the test must have passed, as
> > gcc.c-torture/execute/built-in-setjmp.c is at least four years
> > older than that.
> >
> > > If so, what changed recently?
> >
> > By "these days" I didn't mean "recent", just not "eons ago". :)
> > I see in a gcc-test-results posting from Mike Stein (whom I'd
> > like to thank for test-results posting over the years), matching
> > FAILs for svn revision 126095 (20070628) 4.3.0-era
> > <http://gcc.gnu.org/ml/gcc-testresults/2007-06/msg01287.html>.
> >
> > Sorry, I have nothing in between those reports, my bad. Though
> > I see no point narrowing down the failing revision further here
> > IMO; as mentioned the bug is not that the restoring insn is
> > removed.
> >
> > > Agreed. However, I'd suggest rescuing the comment for the ELIMINABLE_REGS
> > > block from expand_nl_goto_receiver as it still sounds valid to me.
> >
> > Oops, my bad; I see I removed all the good comments. Fixed.
> >
> > > > * stmt.c (expand_nl_goto_receiver): Remove almost-copy of
> > > > expand_builtin_setjmp_receiver.
> > > > (expand_label): Adjust, call expand_builtin_setjmp_receiver
> > > > with NULL for the label parameter.
> > > > * builtins.c (expand_builtin_setjmp_receiver): Don't clobber
> > > > the frame-pointer. Adjust comments.
> > > > [HAVE_builtin_setjmp_receiver]: Emit builtin_setjmp_receiver
> > > > only if LABEL is non-NULL.
> > >
> > > I cannot formally approve, but this looks good to me modulo:
> >
> > > > + If RECEIVER_LABEL is NULL, instead the port-specific parts of a
> > > > + nonlocal goto handler are emitted. */
> > >
> > > The "port-specific parts" wording is a bit confusing I think. I'd just write:
> > >
> > > If RECEIVER_LABEL is NULL, instead contruct a nonlocal goto handler.
> >
> > Sure. Thanks for the review. Updated patch below. As nothing
> > was changed from the previous post but comments as per the
> > review (mostly moving / reviving, fixing one grammo), already
> > covered by the changelog quoted above, the previous testing is
> > still valid.
> >
> > Ok for trunk, approvers?
> >
> > Index: gcc/builtins.c
> > ===================================================================
> > --- gcc/builtins.c (revision 192353)
> > +++ gcc/builtins.c (working copy)
> > @@ -885,14 +885,15 @@ expand_builtin_setjmp_setup (rtx buf_add
> > }
> >
> > /* Construct the trailing part of a __builtin_setjmp call. This is
> > - also called directly by the SJLJ exception handling code. */
> > + also called directly by the SJLJ exception handling code.
> > + If RECEIVER_LABEL is NULL, instead contruct a nonlocal goto handler. */
> >
> > void
> > expand_builtin_setjmp_receiver (rtx receiver_label ATTRIBUTE_UNUSED)
> > {
> > rtx chain;
> >
> > - /* Clobber the FP when we get here, so we have to make sure it's
> > + /* Mark the FP as used when we get here, so we have to make sure it's
> > marked as used by this function. */
> > emit_use (hard_frame_pointer_rtx);
> >
> > @@ -907,17 +908,28 @@ expand_builtin_setjmp_receiver (rtx rece
> > #ifdef HAVE_nonlocal_goto
> > if (! HAVE_nonlocal_goto)
> > #endif
> > - {
> > - emit_move_insn (virtual_stack_vars_rtx, hard_frame_pointer_rtx);
> > - /* This might change the hard frame pointer in ways that aren't
> > - apparent to early optimization passes, so force a clobber. */
> > - emit_clobber (hard_frame_pointer_rtx);
> > - }
> > + /* First adjust our frame pointer to its actual value. It was
> > + previously set to the start of the virtual area corresponding to
> > + the stacked variables when we branched here and now needs to be
> > + adjusted to the actual hardware fp value.
> > +
> > + Assignments to virtual registers are converted by
> > + instantiate_virtual_regs into the corresponding assignment
> > + to the underlying register (fp in this case) that makes
> > + the original assignment true.
> > + So the following insn will actually be decrementing fp by
> > + STARTING_FRAME_OFFSET. */
> > + emit_move_insn (virtual_stack_vars_rtx, hard_frame_pointer_rtx);
> >
> > #if !HARD_FRAME_POINTER_IS_ARG_POINTER
> > if (fixed_regs[ARG_POINTER_REGNUM])
> > {
> > #ifdef ELIMINABLE_REGS
> > + /* If the argument pointer can be eliminated in favor of the
> > + frame pointer, we don't need to restore it. We assume here
> > + that if such an elimination is present, it can always be used.
> > + This is the case on all known machines; if we don't make this
> > + assumption, we do unnecessary saving on many machines. */
> > size_t i;
> > static const struct elims {const int from, to;} elim_regs[] = ELIMINABLE_REGS;
> >
> > @@ -938,7 +950,7 @@ expand_builtin_setjmp_receiver (rtx rece
> > #endif
> >
> > #ifdef HAVE_builtin_setjmp_receiver
> > - if (HAVE_builtin_setjmp_receiver)
> > + if (receiver_label != NULL && HAVE_builtin_setjmp_receiver)
> > emit_insn (gen_builtin_setjmp_receiver (receiver_label));
> > else
> > #endif
> > Index: gcc/stmt.c
> > ===================================================================
> > --- gcc/stmt.c (revision 192353)
> > +++ gcc/stmt.c (working copy)
> > @@ -102,7 +102,6 @@ typedef struct case_node *case_node_ptr;
> >
> > static int n_occurrences (int, const char *);
> > static bool tree_conflicts_with_clobbers_p (tree, HARD_REG_SET *);
> > -static void expand_nl_goto_receiver (void);
> > static bool check_operand_nalternatives (tree, tree);
> > static bool check_unique_operand_names (tree, tree, tree);
> > static char *resolve_operand_name_1 (char *, tree, tree, tree);
> > @@ -196,7 +195,7 @@ expand_label (tree label)
> >
> > if (DECL_NONLOCAL (label))
> > {
> > - expand_nl_goto_receiver ();
> > + expand_builtin_setjmp_receiver (NULL);
> > nonlocal_goto_handler_labels
> > = gen_rtx_EXPR_LIST (VOIDmode, label_r,
> > nonlocal_goto_handler_labels);
> > @@ -1552,77 +1551,6 @@ expand_return (tree retval)
> > }
> > }
> >
> > -/* Emit code to restore vital registers at the beginning of a nonlocal goto
> > - handler. */
> > -static void
> > -expand_nl_goto_receiver (void)
> > -{
> > - rtx chain;
> > -
> > - /* Clobber the FP when we get here, so we have to make sure it's
> > - marked as used by this function. */
> > - emit_use (hard_frame_pointer_rtx);
> > -
> > - /* Mark the static chain as clobbered here so life information
> > - doesn't get messed up for it. */
> > - chain = targetm.calls.static_chain (current_function_decl, true);
> > - if (chain && REG_P (chain))
> > - emit_clobber (chain);
> > -
> > -#ifdef HAVE_nonlocal_goto
> > - if (! HAVE_nonlocal_goto)
> > -#endif
> > - /* First adjust our frame pointer to its actual value. It was
> > - previously set to the start of the virtual area corresponding to
> > - the stacked variables when we branched here and now needs to be
> > - adjusted to the actual hardware fp value.
> > -
> > - Assignments are to virtual registers are converted by
> > - instantiate_virtual_regs into the corresponding assignment
> > - to the underlying register (fp in this case) that makes
> > - the original assignment true.
> > - So the following insn will actually be
> > - decrementing fp by STARTING_FRAME_OFFSET. */
> > - emit_move_insn (virtual_stack_vars_rtx, hard_frame_pointer_rtx);
> > -
> > -#if !HARD_FRAME_POINTER_IS_ARG_POINTER
> > - if (fixed_regs[ARG_POINTER_REGNUM])
> > - {
> > -#ifdef ELIMINABLE_REGS
> > - /* If the argument pointer can be eliminated in favor of the
> > - frame pointer, we don't need to restore it. We assume here
> > - that if such an elimination is present, it can always be used.
> > - This is the case on all known machines; if we don't make this
> > - assumption, we do unnecessary saving on many machines. */
> > - static const struct elims {const int from, to;} elim_regs[] = ELIMINABLE_REGS;
> > - size_t i;
> > -
> > - for (i = 0; i < ARRAY_SIZE (elim_regs); i++)
> > - if (elim_regs[i].from == ARG_POINTER_REGNUM
> > - && elim_regs[i].to == HARD_FRAME_POINTER_REGNUM)
> > - break;
> > -
> > - if (i == ARRAY_SIZE (elim_regs))
> > -#endif
> > - {
> > - /* Now restore our arg pointer from the address at which it
> > - was saved in our stack frame. */
> > - emit_move_insn (crtl->args.internal_arg_pointer,
> > - copy_to_reg (get_arg_pointer_save_area ()));
> > - }
> > - }
> > -#endif
> > -
> > -#ifdef HAVE_nonlocal_goto_receiver
> > - if (HAVE_nonlocal_goto_receiver)
> > - emit_insn (gen_nonlocal_goto_receiver ());
> > -#endif
> > -
> > - /* We must not allow the code we just generated to be reordered by
> > - scheduling. Specifically, the update of the frame pointer must
> > - happen immediately, not later. */
> > - emit_insn (gen_blockage ());
> > -}
> >
> > /* Emit code to save the current value of stack. */
> > rtx
> > brgds, H-P
> >
>
>
--
Richard Biener <rguenther@suse.de>
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imend
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [RFA:] Fix frame-pointer-clobbering in builtins.c:expand_builtin_setjmp_receiver
2012-10-22 22:50 Dominique Dhumieres
@ 2012-10-23 3:39 ` Hans-Peter Nilsson
0 siblings, 0 replies; 7+ messages in thread
From: Hans-Peter Nilsson @ 2012-10-23 3:39 UTC (permalink / raw)
To: Dominique Dhumieres; +Cc: gcc-patches
On Tue, 23 Oct 2012, Dominique Dhumieres wrote:
> This patch (r192676) is probably causing
>
> FAIL: gcc.c-torture/execute/builtins/memcpy-chk.c execution, -Os
> FAIL: gcc.c-torture/execute/builtins/memmove-chk.c execution, -Os
> FAIL: gcc.c-torture/execute/builtins/mempcpy-chk.c execution, -Os
> FAIL: gcc.c-torture/execute/builtins/memset-chk.c execution, -Os
> FAIL: gcc.c-torture/execute/builtins/snprintf-chk.c execution, -Os
> FAIL: gcc.c-torture/execute/builtins/sprintf-chk.c execution, -Os
> FAIL: gcc.c-torture/execute/builtins/stpcpy-chk.c execution, -Os
> FAIL: gcc.c-torture/execute/builtins/stpncpy-chk.c execution, -Os
> FAIL: gcc.c-torture/execute/builtins/strcat-chk.c execution, -Os
> FAIL: gcc.c-torture/execute/builtins/strcpy-chk.c execution, -Os
> FAIL: gcc.c-torture/execute/builtins/strncat-chk.c execution, -Os
> FAIL: gcc.c-torture/execute/builtins/strncpy-chk.c execution, -Os
> FAIL: gcc.c-torture/execute/builtins/vsnprintf-chk.c execution, -Os
> FAIL: gcc.c-torture/execute/builtins/vsprintf-chk.c execution, -Os
>
> on i?86 (see http://gcc.gnu.org/ml/gcc-testresults/2012-10/msg02350.html ).
Confirmed, now PR55030. I'll revert that patch pending further
investigation. Feel free to open PR's whenever something like
this happens.
brgds, H-P
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [RFA:] Fix frame-pointer-clobbering in builtins.c:expand_builtin_setjmp_receiver
@ 2012-10-22 22:50 Dominique Dhumieres
2012-10-23 3:39 ` Hans-Peter Nilsson
0 siblings, 1 reply; 7+ messages in thread
From: Dominique Dhumieres @ 2012-10-22 22:50 UTC (permalink / raw)
To: gcc-patches; +Cc: hp
This patch (r192676) is probably causing
FAIL: gcc.c-torture/execute/builtins/memcpy-chk.c execution, -Os
FAIL: gcc.c-torture/execute/builtins/memmove-chk.c execution, -Os
FAIL: gcc.c-torture/execute/builtins/mempcpy-chk.c execution, -Os
FAIL: gcc.c-torture/execute/builtins/memset-chk.c execution, -Os
FAIL: gcc.c-torture/execute/builtins/snprintf-chk.c execution, -Os
FAIL: gcc.c-torture/execute/builtins/sprintf-chk.c execution, -Os
FAIL: gcc.c-torture/execute/builtins/stpcpy-chk.c execution, -Os
FAIL: gcc.c-torture/execute/builtins/stpncpy-chk.c execution, -Os
FAIL: gcc.c-torture/execute/builtins/strcat-chk.c execution, -Os
FAIL: gcc.c-torture/execute/builtins/strcpy-chk.c execution, -Os
FAIL: gcc.c-torture/execute/builtins/strncat-chk.c execution, -Os
FAIL: gcc.c-torture/execute/builtins/strncpy-chk.c execution, -Os
FAIL: gcc.c-torture/execute/builtins/vsnprintf-chk.c execution, -Os
FAIL: gcc.c-torture/execute/builtins/vsprintf-chk.c execution, -Os
on i?86 (see http://gcc.gnu.org/ml/gcc-testresults/2012-10/msg02350.html ).
TIA
Dominique
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2012-10-23 1:00 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-10-11 23:41 [RFA:] Fix frame-pointer-clobbering in builtins.c:expand_builtin_setjmp_receiver Hans-Peter Nilsson
2012-10-12 8:45 ` Eric Botcazou
2012-10-16 0:47 ` Hans-Peter Nilsson
2012-10-21 6:09 ` Ping: " Hans-Peter Nilsson
2012-10-22 8:00 ` Richard Biener
2012-10-22 22:50 Dominique Dhumieres
2012-10-23 3:39 ` Hans-Peter Nilsson
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).