public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [RFA:] fix PR55030, wrong code from __builtin_setjmp_receiver
@ 2012-11-12  4:49 Hans-Peter Nilsson
  2012-11-12  8:28 ` Eric Botcazou
  0 siblings, 1 reply; 19+ messages in thread
From: Hans-Peter Nilsson @ 2012-11-12  4:49 UTC (permalink / raw)
  To: gcc-patches

The problem exposed in PR55030 (repeatable on x86_64-linux with -m32 at
r192676) is that the fake-frame-pointer "frame" is replaced with the
actual-frame-pointer "bp" in cse1, around the critical insn in
__builtin_setjmp_receiver that restores their defined offset.  The
patch in PR55030/r192676 removed a clobber of the frame-pointer, and
cse1 felt free to replace the former register with the latter, as in
the following expansion of __builtin_setjmp_receiver (in builtins.c:
expand_builtin_setjmp_receiver) from a reduced test-case, see the PR.
(You might find the setup/restore code having an unexpected order;
there are two __builtin_setjmps in series and the setup of the second
one is storing the wrong value for the frame-pointer.)

The pre-transformation dump in cse1 says:

(code_label/s 13 51 14 3 2 "" [2 uses])
(note 14 13 15 3 [bb 3] NOTE_INSN_BASIC_BLOCK)
(insn 15 14 78 3 (use (reg/f:SI 6 bp)) /home/hp/gcctop/tmp/base/gcc/gcc/testsuite/gcc.c-torture/execute/builtins/pr55030-chk.c:31 -1
     (nil))
(insn 78 15 17 3 (set (reg/f:SI 20 frame)
        (reg/f:SI 6 bp)) /home/hp/gcctop/tmp/base/gcc/gcc/testsuite/gcc.c-torture/execute/builtins/pr55030-chk.c:31 -1
     (nil))
(insn 17 78 56 3 (unspec_volatile [
            (const_int 0 [0])
        ] UNSPECV_BLOCKAGE) /home/hp/gcctop/tmp/base/gcc/gcc/testsuite/gcc.c-torture/execute/builtins/pr55030-chk.c:31 643 {blockage}
     (nil))
(insn 56 17 57 3 (set (reg/f:SI 74)
        (symbol_ref:SI ("chk_fail_buf") [flags 0x40]  <var_decl 0x7ffff7512688 chk_fail_buf>)) /home/hp/gcctop/tmp/base/gcc/gcc/testsuite/gcc.c-torture/execute/builtins/pr55030-chk.c:37 65 {*movsi_internal}
     (nil))
(insn 57 56 58 3 (set (mem:SI (reg/f:SI 74) [5 S4 A8])
        (reg/f:SI 20 frame)) /home/hp/gcctop/tmp/base/gcc/gcc/testsuite/gcc.c-torture/execute/builtins/pr55030-chk.c:37 65 {*movsi_internal}
     (nil))

Before r192676 and after the reversion in r192701, there was/is that
weird extra "(clobber (reg/f:SI 6 bp)" inserted before insn 17.  But
without that, we notice post-cse1-transformation, a change in insn 57:

(code_label/s 13 51 14 3 2 "" [3 uses])
(note 14 13 15 3 [bb 3] NOTE_INSN_BASIC_BLOCK)
(insn 15 14 78 3 (use (reg/f:SI 6 bp)) /home/hp/gcctop/tmp/base/gcc/gcc/testsuite/gcc.c-torture/execute/builtins/pr55030-chk.c:31 -1
     (nil))
(insn 78 15 17 3 (set (reg/f:SI 20 frame)
        (reg/f:SI 6 bp)) /home/hp/gcctop/tmp/base/gcc/gcc/testsuite/gcc.c-torture/execute/builtins/pr55030-chk.c:31 -1
     (nil))
(insn 17 78 56 3 (unspec_volatile [
            (const_int 0 [0])
        ] UNSPECV_BLOCKAGE) /home/hp/gcctop/tmp/base/gcc/gcc/testsuite/gcc.c-torture/execute/builtins/pr55030-chk.c:31 643 {blockage}
     (nil))
(insn 56 17 57 3 (set (reg/f:SI 74)
        (symbol_ref:SI ("chk_fail_buf") [flags 0x40]  <var_decl 0x7ffff7512688 chk_fail_buf>)) /home/hp/gcctop/tmp/base/gcc/gcc/testsuite/gcc.c-torture/execute/builtins/pr55030-chk.c:37 65 {*movsi_internal}
     (nil))
(insn 57 56 58 3 (set (mem:SI (reg/f:SI 74) [5 S4 A8])
        (reg/f:SI 6 bp)) /home/hp/gcctop/tmp/base/gcc/gcc/testsuite/gcc.c-torture/execute/builtins/pr55030-chk.c:37 65 {*movsi_internal}
     (nil))

It seemed wrong for this change to be stopped *only* by that (clobber bp).
Stepping through the code for insn 78 and 57, I was looking for
related special-casing of frame- and other similar registers in cse.c
and was a bit confused when I found none (not counting hash_rtx_cb).
(I admit lack of special-casing is a good sign, though. :)  Then I was
blinded by the obvious; the next insn, insn 17, is a "blockage" insn.

This is a target-specific blockage insn, but with the general form
found in all targets defining it.  The default blockage is an empty
asm-volatile, which is what cse_insn recognized.  The blockage insn is
there to "prevent scheduling" of the critical insns and register
values.  It's almost obvious that an unspec_volatile should have that
effect "too"; at least that's intended by the code in
expand_builtin_setjmp_receiver.  Luckily (or unluckily regarding the
presence of the bug) *this* cse code is not run post-frame-layout
(post-reload-cse does not use this code), or it seems people would
soon notice register values used from the wrong side of the blockage,
considering its critical use at the restore of the stack-pointer.
As mentioned in the previous patch,
<http://gcc.gnu.org/ml/gcc-patches/2012-10/msg01172.html>, clobbering
the frame-pointer (and then using it) does not seem the logical
alternative to the patch below; the blockage insn should just do its job.

I updated comments and documentation so it's more apparent that
register uses should also not be moved across the blockage; see the
patched comments.

Tested native x86_64-unknown-linux-gnu w./wo. -m32 and before/after
192677.  Ok to commit?

gcc:
	PR middle-end/55030
	* builtins.c (expand_builtin_setjmp_receiver): Update comment
	regarding purpose of blockage.
	* emit-rtl.c [!HAVE_blockage] (gen_blockage): Similarly for
	the head comment.
	* doc/md.texi (blockage): Update similarly.  Change wording to
	require one of two forms, rather than implying a wider choice.
	* cse.c (cse_insn): Where checking for blocking insns, treat
	UNSPEC_VOLATILE as blocking, besides volatile ASM.

Index: gcc/builtins.c
===================================================================
--- gcc/builtins.c	(revision 192677)
+++ gcc/builtins.c	(working copy)
@@ -963,7 +963,8 @@ expand_builtin_setjmp_receiver (rtx rece

   /* We must not allow the code we just generated to be reordered by
      scheduling.  Specifically, the update of the frame pointer must
-     happen immediately, not later.  */
+     happen immediately, not later.  Similarly, we must block
+     (frame-related) register values to be used across this code.  */
   emit_insn (gen_blockage ());
 }

Index: gcc/cse.c
===================================================================
--- gcc/cse.c	(revision 192677)
+++ gcc/cse.c	(working copy)
@@ -5660,10 +5660,11 @@ cse_insn (rtx insn)
 	  invalidate (XEXP (dest, 0), GET_MODE (dest));
       }

-  /* A volatile ASM invalidates everything.  */
+  /* A volatile ASM or an UNSPEC_VOLATILE invalidates everything.  */
   if (NONJUMP_INSN_P (insn)
-      && GET_CODE (PATTERN (insn)) == ASM_OPERANDS
-      && MEM_VOLATILE_P (PATTERN (insn)))
+      && ((GET_CODE (PATTERN (insn)) == ASM_OPERANDS
+	   && MEM_VOLATILE_P (PATTERN (insn)))
+	  || GET_CODE (PATTERN (insn)) == UNSPEC_VOLATILE))
     flush_hash_table ();

   /* Don't cse over a call to setjmp; on some machines (eg VAX)
Index: gcc/emit-rtl.c
===================================================================
--- gcc/emit-rtl.c	(revision 192677)
+++ gcc/emit-rtl.c	(working copy)
@@ -364,8 +364,8 @@ get_reg_attrs (tree decl, int offset)


 #if !HAVE_blockage
-/* Generate an empty ASM_INPUT, which is used to block attempts to schedule
-   across this insn. */
+/* Generate an empty ASM_INPUT, which is used to block attempts to schedule,
+   and to block register equivalences to be seen across this insn.  */

 rtx
 gen_blockage (void)
Index: gcc/doc/md.texi
===================================================================
--- gcc/doc/md.texi	(revision 192677)
+++ gcc/doc/md.texi	(working copy)
@@ -5832,8 +5832,9 @@ the values of operands 1 and 2.
 @item @samp{blockage}

 This pattern defines a pseudo insn that prevents the instruction
-scheduler from moving instructions across the boundary defined by the
-blockage insn.  Normally an UNSPEC_VOLATILE pattern.
+scheduler and other passes from moving instructions and using register
+equivalences across the boundary defined by the blockage insn.
+This needs to be an UNSPEC_VOLATILE pattern or a volatile ASM.

 @cindex @code{memory_barrier} instruction pattern
 @item @samp{memory_barrier}

brgds, H-P

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFA:] fix PR55030, wrong code from __builtin_setjmp_receiver
  2012-11-12  4:49 [RFA:] fix PR55030, wrong code from __builtin_setjmp_receiver Hans-Peter Nilsson
@ 2012-11-12  8:28 ` Eric Botcazou
  2012-11-19  3:54   ` Hans-Peter Nilsson
  0 siblings, 1 reply; 19+ messages in thread
From: Eric Botcazou @ 2012-11-12  8:28 UTC (permalink / raw)
  To: Hans-Peter Nilsson; +Cc: gcc-patches

> This is a target-specific blockage insn, but with the general form
> found in all targets defining it.  The default blockage is an empty
> asm-volatile, which is what cse_insn recognized.  The blockage insn is
> there to "prevent scheduling" of the critical insns and register
> values.  It's almost obvious that an unspec_volatile should have that
> effect "too"; at least that's intended by the code in
> expand_builtin_setjmp_receiver.  Luckily (or unluckily regarding the
> presence of the bug) *this* cse code is not run post-frame-layout
> (post-reload-cse does not use this code), or it seems people would
> soon notice register values used from the wrong side of the blockage,
> considering its critical use at the restore of the stack-pointer.
> As mentioned in the previous patch,
> <http://gcc.gnu.org/ml/gcc-patches/2012-10/msg01172.html>, clobbering
> the frame-pointer (and then using it) does not seem the logical
> alternative to the patch below; the blockage insn should just do its job.

Agreed.

> I updated comments and documentation so it's more apparent that
> register uses should also not be moved across the blockage; see the
> patched comments.
> 
> Tested native x86_64-unknown-linux-gnu w./wo. -m32 and before/after
> 192677.  Ok to commit?
> 
> gcc:
> 	PR middle-end/55030
> 	* builtins.c (expand_builtin_setjmp_receiver): Update comment
> 	regarding purpose of blockage.
> 	* emit-rtl.c [!HAVE_blockage] (gen_blockage): Similarly for
> 	the head comment.
> 	* doc/md.texi (blockage): Update similarly.  Change wording to
> 	require one of two forms, rather than implying a wider choice.
> 	* cse.c (cse_insn): Where checking for blocking insns, treat
> 	UNSPEC_VOLATILE as blocking, besides volatile ASM.

That's fine on principle, but there is a predicate for this (volatile_insn_p) 
so I think we should use it here.  Moreover, cselib_process_insn has the same 
check so we should adjust it as well, which in turn means that dse.c:scan_insn 
should probably be adjusted too.  OK with these changes, thanks.

-- 
Eric Botcazou

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFA:] fix PR55030, wrong code from __builtin_setjmp_receiver
  2012-11-12  8:28 ` Eric Botcazou
@ 2012-11-19  3:54   ` Hans-Peter Nilsson
  2012-11-19  9:35     ` Eric Botcazou
  0 siblings, 1 reply; 19+ messages in thread
From: Hans-Peter Nilsson @ 2012-11-19  3:54 UTC (permalink / raw)
  To: Eric Botcazou; +Cc: gcc-patches, Alexandre Oliva

On Mon, 12 Nov 2012, Eric Botcazou wrote:
> > This is a target-specific blockage insn, but with the general form
> > found in all targets defining it.  The default blockage is an empty
> > asm-volatile, which is what cse_insn recognized.  The blockage insn is
> > there to "prevent scheduling" of the critical insns and register
> > values.  It's almost obvious that an unspec_volatile should have that
> > effect "too"; at least that's intended by the code in
> > expand_builtin_setjmp_receiver.  Luckily (or unluckily regarding the
> > presence of the bug) *this* cse code is not run post-frame-layout
> > (post-reload-cse does not use this code), or it seems people would
> > soon notice register values used from the wrong side of the blockage,
> > considering its critical use at the restore of the stack-pointer.
> > As mentioned in the previous patch,
> > <http://gcc.gnu.org/ml/gcc-patches/2012-10/msg01172.html>, clobbering
> > the frame-pointer (and then using it) does not seem the logical
> > alternative to the patch below; the blockage insn should just do its job.
>
> Agreed.
>
> > I updated comments and documentation so it's more apparent that
> > register uses should also not be moved across the blockage; see the
> > patched comments.
> >
> > Tested native x86_64-unknown-linux-gnu w./wo. -m32 and before/after
> > 192677.  Ok to commit?
> >
> > gcc:
> > 	PR middle-end/55030
> > 	* builtins.c (expand_builtin_setjmp_receiver): Update comment
> > 	regarding purpose of blockage.
> > 	* emit-rtl.c [!HAVE_blockage] (gen_blockage): Similarly for
> > 	the head comment.
> > 	* doc/md.texi (blockage): Update similarly.  Change wording to
> > 	require one of two forms, rather than implying a wider choice.
> > 	* cse.c (cse_insn): Where checking for blocking insns, treat
> > 	UNSPEC_VOLATILE as blocking, besides volatile ASM.
>
> That's fine on principle, but there is a predicate for this (volatile_insn_p)
> so I think we should use it here.  Moreover, cselib_process_insn has the same
> check so we should adjust it as well, which in turn means that dse.c:scan_insn
> should probably be adjusted too.  OK with these changes, thanks.

Doh, I should have looked for that construct in other places.
Thanks for the review.  Here's an updated patch with that.
Unfortunately, it causes regressions; read on for a very brief
analysis.

For x86_64-linux (base multilib):

Running /home/hp/gcctop/tmp/pr55030-2n/gcc/gcc/testsuite/gcc.dg/guality/guality.exp ...
...
FAIL: gcc.dg/guality/pr36728-1.c  -O1  line 12 arg7 == 30
FAIL: gcc.dg/guality/pr36728-1.c  -O1  line 14 arg7 == 30
FAIL: gcc.dg/guality/pr36728-1.c  -O2  line 12 arg7 == 30
FAIL: gcc.dg/guality/pr36728-1.c  -O2  line 14 arg7 == 30
FAIL: gcc.dg/guality/pr36728-1.c  -O3 -fomit-frame-pointer  line 12 arg7 == 30
FAIL: gcc.dg/guality/pr36728-1.c  -O3 -fomit-frame-pointer  line 14 arg7 == 30
FAIL: gcc.dg/guality/pr36728-1.c  -O3 -g  line 12 arg7 == 30
FAIL: gcc.dg/guality/pr36728-1.c  -O3 -g  line 14 arg7 == 30
FAIL: gcc.dg/guality/pr36728-1.c  -Os  line 12 arg7 == 30
FAIL: gcc.dg/guality/pr36728-1.c  -Os  line 14 arg7 == 30
FAIL: gcc.dg/guality/pr36728-1.c  -O2 -flto -fno-use-linker-plugin -flto-partition=none  line 12 arg7 == 30
FAIL: gcc.dg/guality/pr36728-1.c  -O2 -flto -fno-use-linker-plugin -flto-partition=none  line 14 arg7 == 30
FAIL: gcc.dg/guality/pr36728-1.c  -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  line 12 arg7 == 30
FAIL: gcc.dg/guality/pr36728-1.c  -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  line 14 arg7 == 30

And for -m32:
FAIL: gcc.dg/guality/pr36728-1.c  -O1  line 12 arg2 == 2
FAIL: gcc.dg/guality/pr36728-1.c  -O1  line 12 arg3 == 3
FAIL: gcc.dg/guality/pr36728-1.c  -O1  line 12 arg4 == 4
FAIL: gcc.dg/guality/pr36728-1.c  -O1  line 12 arg5 == 5
FAIL: gcc.dg/guality/pr36728-1.c  -O1  line 12 arg6 == 6
FAIL: gcc.dg/guality/pr36728-1.c  -O1  line 12 arg7 == 30
FAIL: gcc.dg/guality/pr36728-1.c  -O1  line 14 arg2 == 2
FAIL: gcc.dg/guality/pr36728-1.c  -O1  line 14 arg3 == 3
FAIL: gcc.dg/guality/pr36728-1.c  -O1  line 14 arg4 == 4
FAIL: gcc.dg/guality/pr36728-1.c  -O1  line 14 arg5 == 5
FAIL: gcc.dg/guality/pr36728-1.c  -O1  line 14 arg6 == 6
FAIL: gcc.dg/guality/pr36728-1.c  -O1  line 14 arg7 == 30
FAIL: gcc.dg/guality/pr36728-1.c  -O2  line 12 arg2 == 2
FAIL: gcc.dg/guality/pr36728-1.c  -O2  line 12 arg3 == 3
FAIL: gcc.dg/guality/pr36728-1.c  -O2  line 12 arg4 == 4
FAIL: gcc.dg/guality/pr36728-1.c  -O2  line 12 arg5 == 5
FAIL: gcc.dg/guality/pr36728-1.c  -O2  line 12 arg6 == 6
FAIL: gcc.dg/guality/pr36728-1.c  -O2  line 12 arg7 == 30
FAIL: gcc.dg/guality/pr36728-1.c  -O2  line 14 arg2 == 2
FAIL: gcc.dg/guality/pr36728-1.c  -O2  line 14 arg3 == 3
FAIL: gcc.dg/guality/pr36728-1.c  -O2  line 14 arg4 == 4
FAIL: gcc.dg/guality/pr36728-1.c  -O2  line 14 arg5 == 5
FAIL: gcc.dg/guality/pr36728-1.c  -O2  line 14 arg6 == 6
FAIL: gcc.dg/guality/pr36728-1.c  -O2  line 14 arg7 == 30
FAIL: gcc.dg/guality/pr36728-1.c  -O3 -fomit-frame-pointer  line 12 arg2 == 2
FAIL: gcc.dg/guality/pr36728-1.c  -O3 -fomit-frame-pointer  line 12 arg3 == 3
FAIL: gcc.dg/guality/pr36728-1.c  -O3 -fomit-frame-pointer  line 12 arg4 == 4
FAIL: gcc.dg/guality/pr36728-1.c  -O3 -fomit-frame-pointer  line 12 arg5 == 5
FAIL: gcc.dg/guality/pr36728-1.c  -O3 -fomit-frame-pointer  line 12 arg6 == 6
FAIL: gcc.dg/guality/pr36728-1.c  -O3 -fomit-frame-pointer  line 12 arg7 == 30
FAIL: gcc.dg/guality/pr36728-1.c  -O3 -fomit-frame-pointer  line 14 arg2 == 2
FAIL: gcc.dg/guality/pr36728-1.c  -O3 -fomit-frame-pointer  line 14 arg3 == 3
FAIL: gcc.dg/guality/pr36728-1.c  -O3 -fomit-frame-pointer  line 14 arg4 == 4
FAIL: gcc.dg/guality/pr36728-1.c  -O3 -fomit-frame-pointer  line 14 arg5 == 5
FAIL: gcc.dg/guality/pr36728-1.c  -O3 -fomit-frame-pointer  line 14 arg6 == 6
FAIL: gcc.dg/guality/pr36728-1.c  -O3 -fomit-frame-pointer  line 14 arg7 == 30
FAIL: gcc.dg/guality/pr36728-1.c  -O3 -g  line 12 arg2 == 2
FAIL: gcc.dg/guality/pr36728-1.c  -O3 -g  line 12 arg3 == 3
FAIL: gcc.dg/guality/pr36728-1.c  -O3 -g  line 12 arg4 == 4
FAIL: gcc.dg/guality/pr36728-1.c  -O3 -g  line 12 arg5 == 5
FAIL: gcc.dg/guality/pr36728-1.c  -O3 -g  line 12 arg6 == 6
FAIL: gcc.dg/guality/pr36728-1.c  -O3 -g  line 12 arg7 == 30
FAIL: gcc.dg/guality/pr36728-1.c  -O3 -g  line 14 arg2 == 2
FAIL: gcc.dg/guality/pr36728-1.c  -O3 -g  line 14 arg3 == 3
FAIL: gcc.dg/guality/pr36728-1.c  -O3 -g  line 14 arg4 == 4
FAIL: gcc.dg/guality/pr36728-1.c  -O3 -g  line 14 arg5 == 5
FAIL: gcc.dg/guality/pr36728-1.c  -O3 -g  line 14 arg6 == 6
FAIL: gcc.dg/guality/pr36728-1.c  -O3 -g  line 14 arg7 == 30
FAIL: gcc.dg/guality/pr36728-1.c  -Os  line 12 arg2 == 2
FAIL: gcc.dg/guality/pr36728-1.c  -Os  line 12 arg3 == 3
FAIL: gcc.dg/guality/pr36728-1.c  -Os  line 12 arg4 == 4
FAIL: gcc.dg/guality/pr36728-1.c  -Os  line 12 arg5 == 5
FAIL: gcc.dg/guality/pr36728-1.c  -Os  line 12 arg6 == 6
FAIL: gcc.dg/guality/pr36728-1.c  -Os  line 12 arg7 == 30
FAIL: gcc.dg/guality/pr36728-1.c  -Os  line 14 arg2 == 2
FAIL: gcc.dg/guality/pr36728-1.c  -Os  line 14 arg3 == 3
FAIL: gcc.dg/guality/pr36728-1.c  -Os  line 14 arg4 == 4
FAIL: gcc.dg/guality/pr36728-1.c  -Os  line 14 arg5 == 5
FAIL: gcc.dg/guality/pr36728-1.c  -Os  line 14 arg6 == 6
FAIL: gcc.dg/guality/pr36728-1.c  -Os  line 14 arg7 == 30
FAIL: gcc.dg/guality/pr36728-1.c  -O2 -flto -fno-use-linker-plugin -flto-partition=none  line 12 arg2 == 2
FAIL: gcc.dg/guality/pr36728-1.c  -O2 -flto -fno-use-linker-plugin -flto-partition=none  line 12 arg3 == 3
FAIL: gcc.dg/guality/pr36728-1.c  -O2 -flto -fno-use-linker-plugin -flto-partition=none  line 12 arg4 == 4
FAIL: gcc.dg/guality/pr36728-1.c  -O2 -flto -fno-use-linker-plugin -flto-partition=none  line 12 arg5 == 5
FAIL: gcc.dg/guality/pr36728-1.c  -O2 -flto -fno-use-linker-plugin -flto-partition=none  line 12 arg6 == 6
FAIL: gcc.dg/guality/pr36728-1.c  -O2 -flto -fno-use-linker-plugin -flto-partition=none  line 12 arg7 == 30
FAIL: gcc.dg/guality/pr36728-1.c  -O2 -flto -fno-use-linker-plugin -flto-partition=none  line 14 arg2 == 2
FAIL: gcc.dg/guality/pr36728-1.c  -O2 -flto -fno-use-linker-plugin -flto-partition=none  line 14 arg3 == 3
FAIL: gcc.dg/guality/pr36728-1.c  -O2 -flto -fno-use-linker-plugin -flto-partition=none  line 14 arg4 == 4
FAIL: gcc.dg/guality/pr36728-1.c  -O2 -flto -fno-use-linker-plugin -flto-partition=none  line 14 arg5 == 5
FAIL: gcc.dg/guality/pr36728-1.c  -O2 -flto -fno-use-linker-plugin -flto-partition=none  line 14 arg6 == 6
FAIL: gcc.dg/guality/pr36728-1.c  -O2 -flto -fno-use-linker-plugin -flto-partition=none  line 14 arg7 == 30
FAIL: gcc.dg/guality/pr36728-1.c  -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  line 12 arg1 == 1
FAIL: gcc.dg/guality/pr36728-1.c  -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  line 14 arg1 == 1

I looked into gcc.dg/guality/pr36728-1.c at base -O1.

The generated assembly code modulo debug info, is the same.  The
value of arg7 is optimized out, says gdb in gcc.log.  The
problem doesn't seem to be any md-generated
frame-related-barrier as was my first guess, but the volatile
asms in the source(!).  The first one looks like this (and the
second one similar) in pr36728-1.c.168r.cse1 (r193583):

(insn 26 25 27 2 (parallel [
            (set (mem/c:SI (plus:DI (reg/f:DI 20 frame)
                        (const_int -32 [0xffffffffffffffe0])) [0 y+0 S4 A256])
                (asm_operands/v:SI ("") ("=m") 0 [
                        (mem/c:SI (plus:DI (reg/f:DI 20 frame)
                                (const_int -32 [0xffffffffffffffe0])) [0 y+0 S4 A256])
                    ]
                     [
                        (asm_input:SI ("m") (null):0)
                    ]
                     [] /home/hp/gcctop/tmp/nbasf/gcc/gcc/testsuite/gcc.dg/guality/pr36728-1.c:12))
            (clobber (reg:QI 18 fpsr))
            (clobber (reg:QI 17 flags))
        ]) /home/hp/gcctop/tmp/nbasf/gcc/gcc/testsuite/gcc.dg/guality/pr36728-1.c:12 -1
     (nil))

It's not caught by the previous test:
-      && GET_CODE (PATTERN (insn)) == ASM_OPERANDS
-      && MEM_VOLATILE_P (PATTERN (insn)))

...but since it is a volatile asm (as seen in the source and by
the /v on the asm_operands) volatile_insn_p does catch it (as
expected and intended) and down the road for some reason, gcc
can't recover the arg7 contents.  Somewhat expected, but this
volatile asm is also more volatile than intended; a clobber list
for example as seen above inserted by the md, is now redundant.

I'm not sure what to do with this.  Try changing volatile_insn_p
adding a parameter to optionally allow volatile asms with
outputs to pass?  But then, when *should* that distinction be
done, to let such volatile asms be allowed to pass as
not-really-as-volatile-as-we-look-for?  I'd think "never" for
any of the the patched code, or maybe "only when looking at
effects on memory".

Maybe there's some known gotcha in debug handling.  I'm CC:ing
Alexandre, in case he happens to have input to this off the top
of his head (or has copious spare time :)

PS. without the dse.c and cselib.c changes, it passed testing.

gcc:
	PR middle-end/55030
	* builtins.c (expand_builtin_setjmp_receiver): Update comment
	regarding purpose of blockage.
	* emit-rtl.c [!HAVE_blockage] (gen_blockage): Similarly for
	the head comment.
	* rtlanal.c (volatile_insn_p): Ditto.
	* doc/md.texi (blockage): Update similarly.  Change wording to
	require one of two forms, rather than implying a wider choice.
	* cse.c (cse_insn): Where checking for blocking insns, use
	volatile_insn_p instead of manual check for volatile ASM.
	* dse.c (scan_insn): Ditto.
	* cselib.c (cselib_process_insn): Ditto.

Index: gcc/dse.c
===================================================================
--- gcc/dse.c	(revision 192677)
+++ gcc/dse.c	(working copy)
@@ -2522,8 +2522,7 @@ scan_insn (bb_info_t bb_info, rtx insn)
   /* Cselib clears the table for this case, so we have to essentially
      do the same.  */
   if (NONJUMP_INSN_P (insn)
-      && GET_CODE (PATTERN (insn)) == ASM_OPERANDS
-      && MEM_VOLATILE_P (PATTERN (insn)))
+      && volatile_insn_p (PATTERN (insn)))
     {
       add_wild_read (bb_info);
       insn_info->cannot_delete = true;
Index: gcc/rtlanal.c
===================================================================
--- gcc/rtlanal.c	(revision 192677)
+++ gcc/rtlanal.c	(working copy)
@@ -2083,8 +2083,8 @@ remove_node_from_expr_list (const_rtx no

 /* Nonzero if X contains any volatile instructions.  These are instructions
    which may cause unpredictable machine state instructions, and thus no
-   instructions should be moved or combined across them.  This includes
-   only volatile asms and UNSPEC_VOLATILE instructions.  */
+   instructions or register uses should be moved or combined across them.
+   This includes only volatile asms and UNSPEC_VOLATILE instructions.  */

 int
 volatile_insn_p (const_rtx x)
Index: gcc/builtins.c
===================================================================
--- gcc/builtins.c	(revision 192677)
+++ gcc/builtins.c	(working copy)
@@ -963,7 +963,8 @@ expand_builtin_setjmp_receiver (rtx rece

   /* We must not allow the code we just generated to be reordered by
      scheduling.  Specifically, the update of the frame pointer must
-     happen immediately, not later.  */
+     happen immediately, not later.  Similarly, we must block
+     (frame-related) register values to be used across this code.  */
   emit_insn (gen_blockage ());
 }

Index: gcc/cselib.c
===================================================================
--- gcc/cselib.c	(revision 192677)
+++ gcc/cselib.c	(working copy)
@@ -2607,13 +2607,12 @@ cselib_process_insn (rtx insn)

   cselib_current_insn = insn;

-  /* Forget everything at a CODE_LABEL, a volatile asm, or a setjmp.  */
+  /* Forget everything at a CODE_LABEL, a volatile insn, or a setjmp.  */
   if (LABEL_P (insn)
       || (CALL_P (insn)
 	  && find_reg_note (insn, REG_SETJMP, NULL))
       || (NONJUMP_INSN_P (insn)
-	  && GET_CODE (PATTERN (insn)) == ASM_OPERANDS
-	  && MEM_VOLATILE_P (PATTERN (insn))))
+	  && volatile_insn_p (PATTERN (insn))))
     {
       cselib_reset_table (next_uid);
       cselib_current_insn = NULL_RTX;
Index: gcc/cse.c
===================================================================
--- gcc/cse.c	(revision 192677)
+++ gcc/cse.c	(working copy)
@@ -5660,10 +5660,9 @@ cse_insn (rtx insn)
 	  invalidate (XEXP (dest, 0), GET_MODE (dest));
       }

-  /* A volatile ASM invalidates everything.  */
+  /* A volatile ASM or an UNSPEC_VOLATILE invalidates everything.  */
   if (NONJUMP_INSN_P (insn)
-      && GET_CODE (PATTERN (insn)) == ASM_OPERANDS
-      && MEM_VOLATILE_P (PATTERN (insn)))
+      && volatile_insn_p (PATTERN (insn)))
     flush_hash_table ();

   /* Don't cse over a call to setjmp; on some machines (eg VAX)
Index: gcc/emit-rtl.c
===================================================================
--- gcc/emit-rtl.c	(revision 192677)
+++ gcc/emit-rtl.c	(working copy)
@@ -364,8 +364,8 @@ get_reg_attrs (tree decl, int offset)


 #if !HAVE_blockage
-/* Generate an empty ASM_INPUT, which is used to block attempts to schedule
-   across this insn. */
+/* Generate an empty ASM_INPUT, which is used to block attempts to schedule,
+   and to block register equivalences to be seen across this insn.  */

 rtx
 gen_blockage (void)
Index: gcc/doc/md.texi
===================================================================
--- gcc/doc/md.texi	(revision 192677)
+++ gcc/doc/md.texi	(working copy)
@@ -5832,8 +5832,9 @@ the values of operands 1 and 2.
 @item @samp{blockage}

 This pattern defines a pseudo insn that prevents the instruction
-scheduler from moving instructions across the boundary defined by the
-blockage insn.  Normally an UNSPEC_VOLATILE pattern.
+scheduler and other passes from moving instructions and using register
+equivalences across the boundary defined by the blockage insn.
+This needs to be an UNSPEC_VOLATILE pattern or a volatile ASM.

 @cindex @code{memory_barrier} instruction pattern
 @item @samp{memory_barrier}

brgds, H-P

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFA:] fix PR55030, wrong code from __builtin_setjmp_receiver
  2012-11-19  3:54   ` Hans-Peter Nilsson
@ 2012-11-19  9:35     ` Eric Botcazou
  2012-11-26  3:30       ` Hans-Peter Nilsson
  2012-11-27 11:44       ` Hans-Peter Nilsson
  0 siblings, 2 replies; 19+ messages in thread
From: Eric Botcazou @ 2012-11-19  9:35 UTC (permalink / raw)
  To: Hans-Peter Nilsson; +Cc: gcc-patches, Alexandre Oliva, Jakub Jelinek

[-- Attachment #1: Type: text/plain, Size: 4251 bytes --]

> Unfortunately, it causes regressions; read on for a very brief
> analysis.
> 
> For x86_64-linux (base multilib):
> 
> Running
> /home/hp/gcctop/tmp/pr55030-2n/gcc/gcc/testsuite/gcc.dg/guality/guality.exp
> ... ...
> FAIL: gcc.dg/guality/pr36728-1.c  -O1  line 12 arg7 == 30
> [...]
>
> I looked into gcc.dg/guality/pr36728-1.c at base -O1.
> 
> The generated assembly code modulo debug info, is the same.  The
> value of arg7 is optimized out, says gdb in gcc.log.  The
> problem doesn't seem to be any md-generated
> frame-related-barrier as was my first guess, but the volatile
> asms in the source(!).  The first one looks like this (and the
> second one similar) in pr36728-1.c.168r.cse1 (r193583):
> 
> (insn 26 25 27 2 (parallel [
>             (set (mem/c:SI (plus:DI (reg/f:DI 20 frame)
>                         (const_int -32 [0xffffffffffffffe0])) [0 y+0 S4
> A256]) (asm_operands/v:SI ("") ("=m") 0 [
>                         (mem/c:SI (plus:DI (reg/f:DI 20 frame)
>                                 (const_int -32 [0xffffffffffffffe0])) [0 y+0
> S4 A256]) ]
>                      [
>                         (asm_input:SI ("m") (null):0)
>                     ]
>                      []
> /home/hp/gcctop/tmp/nbasf/gcc/gcc/testsuite/gcc.dg/guality/pr36728-1.c:12))
> (clobber (reg:QI 18 fpsr))
>             (clobber (reg:QI 17 flags))
>         ])
> /home/hp/gcctop/tmp/nbasf/gcc/gcc/testsuite/gcc.dg/guality/pr36728-1.c:12
> -1 (nil))
> 
> It's not caught by the previous test:
> -      && GET_CODE (PATTERN (insn)) == ASM_OPERANDS
> -      && MEM_VOLATILE_P (PATTERN (insn)))
> 
> ...but since it is a volatile asm (as seen in the source and by
> the /v on the asm_operands) volatile_insn_p does catch it (as
> expected and intended) and down the road for some reason, gcc
> can't recover the arg7 contents.  Somewhat expected, but this
> volatile asm is also more volatile than intended; a clobber list
> for example as seen above inserted by the md, is now redundant.

Thanks for the analysis.  I don't think that the redundancy of the clobber 
list matters here: the clobbers are added to all asm statements, volatile or 
not, so they aren't intended to make the volatile statements more precise in 
the hope of optimizing around them.

> I'm not sure what to do with this.  Try changing volatile_insn_p
> adding a parameter to optionally allow volatile asms with
> outputs to pass?  But then, when *should* that distinction be
> done, to let such volatile asms be allowed to pass as
> not-really-as-volatile-as-we-look-for?  I'd think "never" for
> any of the the patched code, or maybe "only when looking at
> effects on memory".

Yes, weakening volatile_insn_p seems also dangerous to me, and I'd rather lean 
towards the opposite, conservative side.

We apparently have a small conflict between the meaning of volatile asms with 
operands at the source level and volatile_insn_p.  However, I think that the 
latter interpretation is the correct one: if your asm statements have effects 
that can be described, then you should use output constraints instead of 
volatile; otherwise, you should use volatile and the output constraints have 
essentially no meaning.

The gcc.dg/guality/pr36728-[12].c testcases use volatile asms in an artificial 
way to coax the compiler into following a certain behaviour, so I don't think 
that they should be taken into account to judge the correctness of the change.
Therefore, I'd go ahead and apply the full patch below, possibly adjusting 
both testcases to cope with it.  Tentative (and untested) patch attached.

I've also CCed Jakub though, as he might have a different opinion.

> gcc:
> 	PR middle-end/55030
> 	* builtins.c (expand_builtin_setjmp_receiver): Update comment
> 	regarding purpose of blockage.
> 	* emit-rtl.c [!HAVE_blockage] (gen_blockage): Similarly for
> 	the head comment.
> 	* rtlanal.c (volatile_insn_p): Ditto.
> 	* doc/md.texi (blockage): Update similarly.  Change wording to
> 	require one of two forms, rather than implying a wider choice.
> 	* cse.c (cse_insn): Where checking for blocking insns, use
> 	volatile_insn_p instead of manual check for volatile ASM.
> 	* dse.c (scan_insn): Ditto.
> 	* cselib.c (cselib_process_insn): Ditto.

-- 
Eric Botcazou

[-- Attachment #2: p.diff --]
[-- Type: text/x-patch, Size: 2027 bytes --]

Index: gcc.dg/guality/pr36728-1.c
===================================================================
--- gcc.dg/guality/pr36728-1.c	(revision 193596)
+++ gcc.dg/guality/pr36728-1.c	(working copy)
@@ -9,10 +9,10 @@ foo (int arg1, int arg2, int arg3, int a
   int __attribute__ ((aligned(32))) y;
 
   y = 2;
-  asm volatile ("" : "=m" (y) : "m" (y));
+  asm ("" : "=m" (y) : "m" (y));
   x[0] = 25;
-  asm volatile ("" : "=m" (x[0]) : "m" (x[0]));
-  return y;
+  asm ("" : "=m" (x[0]) : "m" (x[0]));
+  return y + x[0];
 }
 
 /* On s390(x) r2 and r3 are (depending on the optimization level) used
@@ -39,11 +39,13 @@ foo (int arg1, int arg2, int arg3, int a
 /* { dg-final { gdb-test 14 "*x" "(char) 25" } } */
 /* { dg-final { gdb-test 14 "y" "2" } } */
 
+int a;
+
 int
 main ()
 {
   int l = 0;
-  asm volatile ("" : "=r" (l) : "0" (l));
-  foo (l + 1, l + 2, l + 3, l + 4, l + 5, l + 6, l + 30);
+  asm ("" : "=r" (l) : "0" (l));
+  a = foo (l + 1, l + 2, l + 3, l + 4, l + 5, l + 6, l + 30);
   return 0;
 }
Index: gcc.dg/guality/pr36728-2.c
===================================================================
--- gcc.dg/guality/pr36728-2.c	(revision 193596)
+++ gcc.dg/guality/pr36728-2.c	(working copy)
@@ -9,10 +9,10 @@ foo (int arg1, int arg2, int arg3, int a
   int __attribute__ ((aligned(32))) y;
 
   y = 2;
-  asm volatile ("" : "=m" (y) : "m" (y));
+  asm ("" : "=m" (y) : "m" (y));
   x[0] = 25;
-  asm volatile ("" : "=m" (x[0]) : "m" (x[0]));
-  return y;
+  asm ("" : "=m" (x[0]) : "m" (x[0]));
+  return y + x[0];
 }
 
 /* On s390(x) r2 and r3 are (depending on the optimization level) used
@@ -39,11 +39,13 @@ foo (int arg1, int arg2, int arg3, int a
 /* { dg-final { gdb-test 14 "*x" "(char) 25" } } */
 /* { dg-final { gdb-test 14 "y" "2" } } */
 
+int a;
+
 int
 main ()
 {
   int l = 0;
-  asm volatile ("" : "=r" (l) : "0" (l));
-  foo (l + 1, l + 2, l + 3, l + 4, l + 5, l + 6, l + 30);
+  asm ("" : "=r" (l) : "0" (l));
+  a = foo (l + 1, l + 2, l + 3, l + 4, l + 5, l + 6, l + 30);
   return 0;
 }

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFA:] fix PR55030, wrong code from __builtin_setjmp_receiver
  2012-11-19  9:35     ` Eric Botcazou
@ 2012-11-26  3:30       ` Hans-Peter Nilsson
  2012-11-27 11:44       ` Hans-Peter Nilsson
  1 sibling, 0 replies; 19+ messages in thread
From: Hans-Peter Nilsson @ 2012-11-26  3:30 UTC (permalink / raw)
  To: Eric Botcazou; +Cc: gcc-patches, Alexandre Oliva, Jakub Jelinek

Thanks for the reviews!

On Mon, 19 Nov 2012, Eric Botcazou wrote:
> Thanks for the analysis.  I don't think that the redundancy of the clobber
> list matters here: the clobbers are added to all asm statements, volatile or
> not, so they aren't intended to make the volatile statements more precise in
> the hope of optimizing around them.

Oh, right.

> > I'm not sure what to do with this.  Try changing volatile_insn_p
> > adding a parameter to optionally allow volatile asms with
> > outputs to pass?  But then, when *should* that distinction be
> > done, to let such volatile asms be allowed to pass as
> > not-really-as-volatile-as-we-look-for?  I'd think "never" for
> > any of the the patched code, or maybe "only when looking at
> > effects on memory".
>
> Yes, weakening volatile_insn_p seems also dangerous to me, and I'd rather lean
> towards the opposite, conservative side.
>
> We apparently have a small conflict between the meaning of volatile asms with
> operands at the source level and volatile_insn_p.  However, I think that the
> latter interpretation is the correct one: if your asm statements have effects
> that can be described, then you should use output constraints instead of
> volatile; otherwise, you should use volatile and the output constraints have
> essentially no meaning.
>
> The gcc.dg/guality/pr36728-[12].c testcases use volatile asms in an artificial
> way to coax the compiler into following a certain behaviour, so I don't think
> that they should be taken into account to judge the correctness of the change.
> Therefore, I'd go ahead and apply the full patch below, possibly adjusting
> both testcases to cope with it.  Tentative (and untested) patch attached.
>
> I've also CCed Jakub though, as he might have a different opinion.

Ok, no comments, so let's go with adjusting the test-cases.
But, your patch for the test-cases doesn't work with (or
without) the patch for pr36728-2.c.  Bonus points for being very
close, though!

I don't really see *why* your patch shouldn't work.
Specifically, I don't see why the *wrong value* is displayed
instead of at least "<optimized out>" for any changed behavior,
so I entered PR55467 for the related observations.

By instead of your approach; not using x[0] after the asm, and
instead changing the asm to also write the result to an external
variable (i.e. an unknown operation writing there and x[0], with
x[0] as input), I think I mimicked the intended behavior, such
that x[0] is preserved as a variable until the line with the
asm.  At least nothing changes for -O1 besides debug info and
the write to the variable "a" in main, for an unpatched gcc
with/without patched test-case. (The patched test-cases all pass
for an otherwise unpatched gcc at r193777.)

Committed after committing the reapplied previous patch; r193802
and r193803.

> > gcc:
> > 	PR middle-end/55030
> > 	* builtins.c (expand_builtin_setjmp_receiver): Update comment
> > 	regarding purpose of blockage.
> > 	* emit-rtl.c [!HAVE_blockage] (gen_blockage): Similarly for
> > 	the head comment.
> > 	* rtlanal.c (volatile_insn_p): Ditto.
> > 	* doc/md.texi (blockage): Update similarly.  Change wording to
> > 	require one of two forms, rather than implying a wider choice.
> > 	* cse.c (cse_insn): Where checking for blocking insns, use
> > 	volatile_insn_p instead of manual check for volatile ASM.
> > 	* dse.c (scan_insn): Ditto.
> > 	* cselib.c (cselib_process_insn): Ditto.

testsuite:
	PR middle-end/55030
	* gcc.dg/guality/pr36728-1.c, gcc.dg/guality/pr36728-2.c (foo): Don't
	use volatile asms, use plain asms.   Where the output value for the
	asm is unused, write a global variable.

Index: gcc/testsuite/gcc.dg/guality/pr36728-1.c
===================================================================
--- gcc/testsuite/gcc.dg/guality/pr36728-1.c	(revision 193776)
+++ gcc/testsuite/gcc.dg/guality/pr36728-1.c	(working copy)
@@ -1,7 +1,7 @@
 /* PR debug/36728 */
 /* { dg-do run } */
 /* { dg-options "-g" } */
-
+int a;
 int __attribute__((noinline))
 foo (int arg1, int arg2, int arg3, int arg4, int arg5, int arg6, int arg7)
 {
@@ -9,9 +9,9 @@ foo (int arg1, int arg2, int arg3, int a
   int __attribute__ ((aligned(32))) y;

   y = 2;
-  asm volatile ("" : "=m" (y) : "m" (y));
+  asm ("" : "=m" (y) : "m" (y));
   x[0] = 25;
-  asm volatile ("" : "=m" (x[0]) : "m" (x[0]));
+  asm ("" : "=m" (x[0]), "=m" (a) : "m" (x[0]));
   return y;
 }

@@ -43,7 +43,7 @@ int
 main ()
 {
   int l = 0;
-  asm volatile ("" : "=r" (l) : "0" (l));
-  foo (l + 1, l + 2, l + 3, l + 4, l + 5, l + 6, l + 30);
+  asm ("" : "=r" (l) : "0" (l));
+  a = foo (l + 1, l + 2, l + 3, l + 4, l + 5, l + 6, l + 30);
   return 0;
 }
Index: gcc/testsuite/gcc.dg/guality/pr36728-2.c
===================================================================
--- gcc/testsuite/gcc.dg/guality/pr36728-2.c	(revision 193776)
+++ gcc/testsuite/gcc.dg/guality/pr36728-2.c	(working copy)
@@ -1,7 +1,7 @@
 /* PR debug/36728 */
 /* { dg-do run } */
 /* { dg-options "-g" } */
-
+int a;
 int __attribute__((noinline))
 foo (int arg1, int arg2, int arg3, int arg4, int arg5, int arg6, int arg7)
 {
@@ -9,9 +9,9 @@ foo (int arg1, int arg2, int arg3, int a
   int __attribute__ ((aligned(32))) y;

   y = 2;
-  asm volatile ("" : "=m" (y) : "m" (y));
+  asm ("" : "=m" (y) : "m" (y));
   x[0] = 25;
-  asm volatile ("" : "=m" (x[0]) : "m" (x[0]));
+  asm ("" : "=m" (x[0]), "=m" (a) : "m" (x[0]));
   return y;
 }

@@ -43,7 +43,7 @@ int
 main ()
 {
   int l = 0;
-  asm volatile ("" : "=r" (l) : "0" (l));
-  foo (l + 1, l + 2, l + 3, l + 4, l + 5, l + 6, l + 30);
+  asm ("" : "=r" (l) : "0" (l));
+  a = foo (l + 1, l + 2, l + 3, l + 4, l + 5, l + 6, l + 30);
   return 0;
 }

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFA:] fix PR55030, wrong code from __builtin_setjmp_receiver
  2012-11-19  9:35     ` Eric Botcazou
  2012-11-26  3:30       ` Hans-Peter Nilsson
@ 2012-11-27 11:44       ` Hans-Peter Nilsson
  2012-11-27 12:04         ` Jakub Jelinek
  1 sibling, 1 reply; 19+ messages in thread
From: Hans-Peter Nilsson @ 2012-11-27 11:44 UTC (permalink / raw)
  To: Eric Botcazou; +Cc: gcc-patches, Alexandre Oliva, Jakub Jelinek

I quoted the whole discussion, see a single line below.

On Mon, 19 Nov 2012, Eric Botcazou wrote:
> > Unfortunately, it causes regressions; read on for a very brief
> > analysis.
> >
> > For x86_64-linux (base multilib):
> >
> > Running
> > /home/hp/gcctop/tmp/pr55030-2n/gcc/gcc/testsuite/gcc.dg/guality/guality.exp
> > ... ...
> > FAIL: gcc.dg/guality/pr36728-1.c  -O1  line 12 arg7 == 30
> > [...]
> >
> > I looked into gcc.dg/guality/pr36728-1.c at base -O1.
> >
> > The generated assembly code modulo debug info, is the same.  The
> > value of arg7 is optimized out, says gdb in gcc.log.  The
> > problem doesn't seem to be any md-generated
> > frame-related-barrier as was my first guess, but the volatile
> > asms in the source(!).  The first one looks like this (and the
> > second one similar) in pr36728-1.c.168r.cse1 (r193583):
> >
> > (insn 26 25 27 2 (parallel [
> >             (set (mem/c:SI (plus:DI (reg/f:DI 20 frame)
> >                         (const_int -32 [0xffffffffffffffe0])) [0 y+0 S4
> > A256]) (asm_operands/v:SI ("") ("=m") 0 [
> >                         (mem/c:SI (plus:DI (reg/f:DI 20 frame)
> >                                 (const_int -32 [0xffffffffffffffe0])) [0 y+0
> > S4 A256]) ]
> >                      [
> >                         (asm_input:SI ("m") (null):0)
> >                     ]
> >                      []
> > /home/hp/gcctop/tmp/nbasf/gcc/gcc/testsuite/gcc.dg/guality/pr36728-1.c:12))
> > (clobber (reg:QI 18 fpsr))
> >             (clobber (reg:QI 17 flags))
> >         ])
> > /home/hp/gcctop/tmp/nbasf/gcc/gcc/testsuite/gcc.dg/guality/pr36728-1.c:12
> > -1 (nil))
> >
> > It's not caught by the previous test:
> > -      && GET_CODE (PATTERN (insn)) == ASM_OPERANDS
> > -      && MEM_VOLATILE_P (PATTERN (insn)))
> >
> > ...but since it is a volatile asm (as seen in the source and by
> > the /v on the asm_operands) volatile_insn_p does catch it (as
> > expected and intended) and down the road for some reason, gcc
> > can't recover the arg7 contents.  Somewhat expected, but this
> > volatile asm is also more volatile than intended; a clobber list
> > for example as seen above inserted by the md, is now redundant.
>
> Thanks for the analysis.  I don't think that the redundancy of the clobber
> list matters here: the clobbers are added to all asm statements, volatile or
> not, so they aren't intended to make the volatile statements more precise in
> the hope of optimizing around them.

IIIUC, Jakub disagrees on this point, see
<http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55467#c13>.

Sigh, hoping we can get consensus on this point, or at least
that gcc has a consistent view...

> > I'm not sure what to do with this.  Try changing volatile_insn_p
> > adding a parameter to optionally allow volatile asms with
> > outputs to pass?  But then, when *should* that distinction be
> > done, to let such volatile asms be allowed to pass as
> > not-really-as-volatile-as-we-look-for?  I'd think "never" for
> > any of the the patched code, or maybe "only when looking at
> > effects on memory".
>
> Yes, weakening volatile_insn_p seems also dangerous to me, and I'd rather lean
> towards the opposite, conservative side.
>
> We apparently have a small conflict between the meaning of volatile asms with
> operands at the source level and volatile_insn_p.  However, I think that the
> latter interpretation is the correct one: if your asm statements have effects
> that can be described, then you should use output constraints instead of
> volatile; otherwise, you should use volatile and the output constraints have
> essentially no meaning.
>
> The gcc.dg/guality/pr36728-[12].c testcases use volatile asms in an artificial
> way to coax the compiler into following a certain behaviour, so I don't think
> that they should be taken into account to judge the correctness of the change.
> Therefore, I'd go ahead and apply the full patch below, possibly adjusting
> both testcases to cope with it.  Tentative (and untested) patch attached.
>
> I've also CCed Jakub though, as he might have a different opinion.
>
> > gcc:
> > 	PR middle-end/55030
> > 	* builtins.c (expand_builtin_setjmp_receiver): Update comment
> > 	regarding purpose of blockage.
> > 	* emit-rtl.c [!HAVE_blockage] (gen_blockage): Similarly for
> > 	the head comment.
> > 	* rtlanal.c (volatile_insn_p): Ditto.
> > 	* doc/md.texi (blockage): Update similarly.  Change wording to
> > 	require one of two forms, rather than implying a wider choice.
> > 	* cse.c (cse_insn): Where checking for blocking insns, use
> > 	volatile_insn_p instead of manual check for volatile ASM.
> > 	* dse.c (scan_insn): Ditto.
> > 	* cselib.c (cselib_process_insn): Ditto.
>
> --
> Eric Botcazou

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFA:] fix PR55030, wrong code from __builtin_setjmp_receiver
  2012-11-27 11:44       ` Hans-Peter Nilsson
@ 2012-11-27 12:04         ` Jakub Jelinek
  2012-11-27 12:27           ` Hans-Peter Nilsson
                             ` (4 more replies)
  0 siblings, 5 replies; 19+ messages in thread
From: Jakub Jelinek @ 2012-11-27 12:04 UTC (permalink / raw)
  To: Hans-Peter Nilsson; +Cc: Eric Botcazou, gcc-patches, Alexandre Oliva

On Tue, Nov 27, 2012 at 06:44:23AM -0500, Hans-Peter Nilsson wrote:
> > We apparently have a small conflict between the meaning of volatile asms with
> > operands at the source level and volatile_insn_p.  However, I think that the
> > latter interpretation is the correct one: if your asm statements have effects
> > that can be described, then you should use output constraints instead of
> > volatile; otherwise, you should use volatile and the output constraints have
> > essentially no meaning.

I strongly disagree with this.  Outputs and clobbers have significant
meaning even on volatile asms, asm volatile doesn't mean all registers and
all memory are supposed to be considered clobbered.  For memory you have
"memory" clobber for that, for registers it is users responsibility to
describe exactly what changes, either in clobbers or in outputs.
The difference between non-volatile and volatile asm is, as the
documentation states:

The `volatile' keyword indicates that the instruction has important
side-effects.  GCC will not delete a volatile `asm' if it is reachable.

Volatile asm acts as an optimization barrier to some extent, but it really
can't modify registers or memory that aren't described as modified in the
asm pattern.  The important side-effects are of some other kind than
modifying registers or memory visible from the current function.
Ditto for UNSPEC_VOLATILE.

So, at least from var-tracking POV which doesn't attempt to perform any
optimizations across any insn, just tries to track where values live, IMHO a
volatile asm acts exactly the same as non-volatile, that is why I'm testing
following patch right now.

But the question is also what effects your patch can have e.g. on RTL DSE.

2012-11-26  Jakub Jelinek  <jakub@redhat.com>

	PR debug/36728
	PR debug/55467
	* cselib.c (cselib_process_insn): If cselib_preserve_constants,
	don't reset table and exit early on volatile insns and setjmp.
	Reset table afterwards on setjmp.

	* gcc.dg/guality/pr36728-1.c: Include "../nop.h", make sure the asm
	are non-empty and add dependency between the first and second asm.
	* gcc.dg/guality/pr36728-2.c: Likewise.
	* gcc.dg/guality/pr36728-3.c: New test.
	* gcc.dg/guality/pr36728-4.c: New test.

--- gcc/cselib.c.jj	2012-11-26 10:14:26.000000000 +0100
+++ gcc/cselib.c	2012-11-26 20:01:10.488304558 +0100
@@ -2626,11 +2626,12 @@ cselib_process_insn (rtx insn)
   cselib_current_insn = insn;
 
   /* Forget everything at a CODE_LABEL, a volatile insn, or a setjmp.  */
-  if (LABEL_P (insn)
-      || (CALL_P (insn)
-	  && find_reg_note (insn, REG_SETJMP, NULL))
-      || (NONJUMP_INSN_P (insn)
-	  && volatile_insn_p (PATTERN (insn))))
+  if ((LABEL_P (insn)
+       || (CALL_P (insn)
+	   && find_reg_note (insn, REG_SETJMP, NULL))
+       || (NONJUMP_INSN_P (insn)
+	   && volatile_insn_p (PATTERN (insn))))
+      && !cselib_preserve_constants)
     {
       cselib_reset_table (next_uid);
       cselib_current_insn = NULL_RTX;
@@ -2668,9 +2669,18 @@ cselib_process_insn (rtx insn)
   /* Look for any CLOBBERs in CALL_INSN_FUNCTION_USAGE, but only
      after we have processed the insn.  */
   if (CALL_P (insn))
-    for (x = CALL_INSN_FUNCTION_USAGE (insn); x; x = XEXP (x, 1))
-      if (GET_CODE (XEXP (x, 0)) == CLOBBER)
-	cselib_invalidate_rtx (XEXP (XEXP (x, 0), 0));
+    {
+      for (x = CALL_INSN_FUNCTION_USAGE (insn); x; x = XEXP (x, 1))
+	if (GET_CODE (XEXP (x, 0)) == CLOBBER)
+	  cselib_invalidate_rtx (XEXP (XEXP (x, 0), 0));
+      /* Flush evertything on setjmp.  */
+      if (cselib_preserve_constants
+	  && find_reg_note (insn, REG_SETJMP, NULL))
+	{
+	  cselib_preserve_only_values ();
+	  cselib_reset_table (next_uid);
+	}
+    }
 
   /* On setter of the hard frame pointer if frame_pointer_needed,
      invalidate stack_pointer_rtx, so that sp and {,h}fp based
--- gcc/testsuite/gcc.dg/guality/pr36728-1.c.jj	2012-11-26 10:14:25.000000000 +0100
+++ gcc/testsuite/gcc.dg/guality/pr36728-1.c	2012-11-27 10:01:14.517080157 +0100
@@ -1,7 +1,11 @@
 /* PR debug/36728 */
 /* { dg-do run } */
 /* { dg-options "-g" } */
-int a;
+
+#include "../nop.h"
+
+int a, b;
+
 int __attribute__((noinline))
 foo (int arg1, int arg2, int arg3, int arg4, int arg5, int arg6, int arg7)
 {
@@ -9,9 +13,9 @@ foo (int arg1, int arg2, int arg3, int a
   int __attribute__ ((aligned(32))) y;
 
   y = 2;
-  asm ("" : "=m" (y) : "m" (y));
+  asm (NOP : "=m" (y), "=m" (b) : "m" (y));
   x[0] = 25;
-  asm ("" : "=m" (x[0]), "=m" (a) : "m" (x[0]));
+  asm (NOP : "=m" (x[0]), "=m" (a) : "m" (x[0]), "m" (b));
   return y;
 }
 
@@ -21,23 +25,23 @@ foo (int arg1, int arg2, int arg3, int a
    and arg2.  So it is expected that these values are unavailable in
    some of these tests.  */
 
-/* { dg-final { gdb-test 12 "arg1" "1" { target { ! "s390*-*-*" } } } }*/
-/* { dg-final { gdb-test 12 "arg2" "2" { target { ! "s390*-*-*" } } } }*/
-/* { dg-final { gdb-test 12 "arg3" "3" } } */
-/* { dg-final { gdb-test 12 "arg4" "4" } } */
-/* { dg-final { gdb-test 12 "arg5" "5" } } */
-/* { dg-final { gdb-test 12 "arg6" "6" } } */
-/* { dg-final { gdb-test 12 "arg7" "30" } } */
-/* { dg-final { gdb-test 12 "y" "2" } } */
-/* { dg-final { gdb-test 14 "arg1" "1" { target { ! "s390*-*-*" } } } }*/
-/* { dg-final { gdb-test 14 "arg2" "2" { target { ! "s390*-*-*" } } } }*/
-/* { dg-final { gdb-test 14 "arg3" "3" } } */
-/* { dg-final { gdb-test 14 "arg4" "4" } } */
-/* { dg-final { gdb-test 14 "arg5" "5" } } */
-/* { dg-final { gdb-test 14 "arg6" "6" } } */
-/* { dg-final { gdb-test 14 "arg7" "30" } } */
-/* { dg-final { gdb-test 14 "*x" "(char) 25" } } */
-/* { dg-final { gdb-test 14 "y" "2" } } */
+/* { dg-final { gdb-test 16 "arg1" "1" { target { ! "s390*-*-*" } } } } */
+/* { dg-final { gdb-test 16 "arg2" "2" { target { ! "s390*-*-*" } } } } */
+/* { dg-final { gdb-test 16 "arg3" "3" } } */
+/* { dg-final { gdb-test 16 "arg4" "4" } } */
+/* { dg-final { gdb-test 16 "arg5" "5" } } */
+/* { dg-final { gdb-test 16 "arg6" "6" } } */
+/* { dg-final { gdb-test 16 "arg7" "30" } } */
+/* { dg-final { gdb-test 16 "y" "2" } } */
+/* { dg-final { gdb-test 18 "arg1" "1" { target { ! "s390*-*-*" } } } } */
+/* { dg-final { gdb-test 18 "arg2" "2" { target { ! "s390*-*-*" } } } } */
+/* { dg-final { gdb-test 18 "arg3" "3" } } */
+/* { dg-final { gdb-test 18 "arg4" "4" } } */
+/* { dg-final { gdb-test 18 "arg5" "5" } } */
+/* { dg-final { gdb-test 18 "arg6" "6" } } */
+/* { dg-final { gdb-test 18 "arg7" "30" } } */
+/* { dg-final { gdb-test 18 "*x" "(char) 25" } } */
+/* { dg-final { gdb-test 18 "y" "2" } } */
 
 int
 main ()
--- gcc/testsuite/gcc.dg/guality/pr36728-2.c.jj	2012-11-26 10:14:25.000000000 +0100
+++ gcc/testsuite/gcc.dg/guality/pr36728-2.c	2012-11-27 10:00:46.237254022 +0100
@@ -1,7 +1,11 @@
 /* PR debug/36728 */
 /* { dg-do run } */
 /* { dg-options "-g" } */
-int a;
+
+#include "../nop.h"
+
+int a, b;
+
 int __attribute__((noinline))
 foo (int arg1, int arg2, int arg3, int arg4, int arg5, int arg6, int arg7)
 {
@@ -9,9 +13,9 @@ foo (int arg1, int arg2, int arg3, int a
   int __attribute__ ((aligned(32))) y;
 
   y = 2;
-  asm ("" : "=m" (y) : "m" (y));
+  asm (NOP : "=m" (y), "=m" (b) : "m" (y));
   x[0] = 25;
-  asm ("" : "=m" (x[0]), "=m" (a) : "m" (x[0]));
+  asm (NOP : "=m" (x[0]), "=m" (a) : "m" (x[0]), "m" (b));
   return y;
 }
 
@@ -21,23 +25,23 @@ foo (int arg1, int arg2, int arg3, int a
    and arg2.  So it is expected that these values are unavailable in
    some of these tests.  */
 
-/* { dg-final { gdb-test 12 "arg1" "1" { target { ! "s390*-*-*" } } } } */
-/* { dg-final { gdb-test 12 "arg2" "2" { target { ! "s390*-*-*" } } } } */
-/* { dg-final { gdb-test 12 "arg3" "3" } } */
-/* { dg-final { gdb-test 12 "arg4" "4" } } */
-/* { dg-final { gdb-test 12 "arg5" "5" } } */
-/* { dg-final { gdb-test 12 "arg6" "6" } } */
-/* { dg-final { gdb-test 12 "arg7" "30" } } */
-/* { dg-final { gdb-test 12 "y" "2" } } */
-/* { dg-final { gdb-test 14 "arg1" "1" { target { ! "s390*-*-*" } } } } */
-/* { dg-final { gdb-test 14 "arg2" "2" { target { ! "s390*-*-*" } } } } */
-/* { dg-final { gdb-test 14 "arg3" "3" } } */
-/* { dg-final { gdb-test 14 "arg4" "4" } } */
-/* { dg-final { gdb-test 14 "arg5" "5" } } */
-/* { dg-final { gdb-test 14 "arg6" "6" } } */
-/* { dg-final { gdb-test 14 "arg7" "30" } } */
-/* { dg-final { gdb-test 14 "*x" "(char) 25" } } */
-/* { dg-final { gdb-test 14 "y" "2" } } */
+/* { dg-final { gdb-test 16 "arg1" "1" { target { ! "s390*-*-*" } } } } */
+/* { dg-final { gdb-test 16 "arg2" "2" { target { ! "s390*-*-*" } } } } */
+/* { dg-final { gdb-test 16 "arg3" "3" } } */
+/* { dg-final { gdb-test 16 "arg4" "4" } } */
+/* { dg-final { gdb-test 16 "arg5" "5" } } */
+/* { dg-final { gdb-test 16 "arg6" "6" } } */
+/* { dg-final { gdb-test 16 "arg7" "30" } } */
+/* { dg-final { gdb-test 16 "y" "2" } } */
+/* { dg-final { gdb-test 18 "arg1" "1" { target { ! "s390*-*-*" } } } } */
+/* { dg-final { gdb-test 18 "arg2" "2" { target { ! "s390*-*-*" } } } } */
+/* { dg-final { gdb-test 18 "arg3" "3" } } */
+/* { dg-final { gdb-test 18 "arg4" "4" } } */
+/* { dg-final { gdb-test 18 "arg5" "5" } } */
+/* { dg-final { gdb-test 18 "arg6" "6" } } */
+/* { dg-final { gdb-test 18 "arg7" "30" } } */
+/* { dg-final { gdb-test 18 "*x" "(char) 25" } } */
+/* { dg-final { gdb-test 18 "y" "2" } } */
 
 int
 main ()
--- gcc/testsuite/gcc.dg/guality/pr36728-3.c.jj	2012-11-26 18:58:40.896599886 +0100
+++ gcc/testsuite/gcc.dg/guality/pr36728-3.c	2012-11-27 10:01:45.455892012 +0100
@@ -0,0 +1,51 @@
+/* PR debug/36728 */
+/* { dg-do run } */
+/* { dg-options "-g" } */
+
+#include "../nop.h"
+
+int __attribute__((noinline))
+foo (int arg1, int arg2, int arg3, int arg4, int arg5, int arg6, int arg7)
+{
+  char *x = __builtin_alloca (arg7);
+  int __attribute__ ((aligned(32))) y;
+
+  y = 2;
+  asm (NOP : "=m" (y) : "m" (y));
+  x[0] = 25;
+  asm volatile (NOP : "=m" (x[0]) : "m" (x[0]));
+  return y;
+}
+
+/* On s390(x) r2 and r3 are (depending on the optimization level) used
+   when adjusting the addresses in order to meet the alignment
+   requirements above.  They usually hold the function arguments arg1
+   and arg2.  So it is expected that these values are unavailable in
+   some of these tests.  */
+
+/* { dg-final { gdb-test 14 "arg1" "1" { target { ! "s390*-*-*" } } } } */
+/* { dg-final { gdb-test 14 "arg2" "2" { target { ! "s390*-*-*" } } } } */
+/* { dg-final { gdb-test 14 "arg3" "3" } } */
+/* { dg-final { gdb-test 14 "arg4" "4" } } */
+/* { dg-final { gdb-test 14 "arg5" "5" } } */
+/* { dg-final { gdb-test 14 "arg6" "6" } } */
+/* { dg-final { gdb-test 14 "arg7" "30" } } */
+/* { dg-final { gdb-test 14 "y" "2" } } */
+/* { dg-final { gdb-test 16 "arg1" "1" { target { ! "s390*-*-*" } } } } */
+/* { dg-final { gdb-test 16 "arg2" "2" { target { ! "s390*-*-*" } } } } */
+/* { dg-final { gdb-test 16 "arg3" "3" } } */
+/* { dg-final { gdb-test 16 "arg4" "4" } } */
+/* { dg-final { gdb-test 16 "arg5" "5" } } */
+/* { dg-final { gdb-test 16 "arg6" "6" } } */
+/* { dg-final { gdb-test 16 "arg7" "30" } } */
+/* { dg-final { gdb-test 16 "*x" "(char) 25" } } */
+/* { dg-final { gdb-test 16 "y" "2" } } */
+
+int
+main ()
+{
+  int l = 0;
+  asm volatile ("" : "=r" (l) : "0" (l));
+  foo (l + 1, l + 2, l + 3, l + 4, l + 5, l + 6, l + 30);
+  return 0;
+}
--- gcc/testsuite/gcc.dg/guality/pr36728-4.c.jj	2012-11-26 18:58:44.624580656 +0100
+++ gcc/testsuite/gcc.dg/guality/pr36728-4.c	2012-11-26 14:56:12.000000000 +0100
@@ -0,0 +1,51 @@
+/* PR debug/36728 */
+/* { dg-do run } */
+/* { dg-options "-g" } */
+
+#include "../nop.h"
+
+int __attribute__((noinline))
+foo (int arg1, int arg2, int arg3, int arg4, int arg5, int arg6, int arg7)
+{
+  char x[30];
+  int __attribute__ ((aligned(32))) y;
+
+  y = 2;
+  asm (NOP : "=m" (y) : "m" (y));
+  x[0] = 25;
+  asm volatile (NOP : "=m" (x[0]) : "m" (x[0]));
+  return y;
+}
+
+/* On s390(x) r2 and r3 are (depending on the optimization level) used
+   when adjusting the addresses in order to meet the alignment
+   requirements above.  They usually hold the function arguments arg1
+   and arg2.  So it is expected that these values are unavailable in
+   some of these tests.  */
+
+/* { dg-final { gdb-test 14 "arg1" "1" { target { ! "s390*-*-*" } } } } */
+/* { dg-final { gdb-test 14 "arg2" "2" { target { ! "s390*-*-*" } } } } */
+/* { dg-final { gdb-test 14 "arg3" "3" } } */
+/* { dg-final { gdb-test 14 "arg4" "4" } } */
+/* { dg-final { gdb-test 14 "arg5" "5" } } */
+/* { dg-final { gdb-test 14 "arg6" "6" } } */
+/* { dg-final { gdb-test 14 "arg7" "30" } } */
+/* { dg-final { gdb-test 14 "y" "2" } } */
+/* { dg-final { gdb-test 16 "arg1" "1" { target { ! "s390*-*-*" } } } } */
+/* { dg-final { gdb-test 16 "arg2" "2" { target { ! "s390*-*-*" } } } } */
+/* { dg-final { gdb-test 16 "arg3" "3" } } */
+/* { dg-final { gdb-test 16 "arg4" "4" } } */
+/* { dg-final { gdb-test 16 "arg5" "5" } } */
+/* { dg-final { gdb-test 16 "arg6" "6" } } */
+/* { dg-final { gdb-test 16 "arg7" "30" } } */
+/* { dg-final { gdb-test 16 "*x" "(char) 25" } } */
+/* { dg-final { gdb-test 16 "y" "2" } } */
+
+int
+main ()
+{
+  int l = 0;
+  asm volatile ("" : "=r" (l) : "0" (l));
+  foo (l + 1, l + 2, l + 3, l + 4, l + 5, l + 6, l + 30);
+  return 0;
+}


	Jakub

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFA:] fix PR55030, wrong code from __builtin_setjmp_receiver
  2012-11-27 12:04         ` Jakub Jelinek
@ 2012-11-27 12:27           ` Hans-Peter Nilsson
  2012-11-27 13:32           ` Jakub Jelinek
                             ` (3 subsequent siblings)
  4 siblings, 0 replies; 19+ messages in thread
From: Hans-Peter Nilsson @ 2012-11-27 12:27 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Eric Botcazou, gcc-patches, Alexandre Oliva

On Tue, 27 Nov 2012, Jakub Jelinek wrote:
> On Tue, Nov 27, 2012 at 06:44:23AM -0500, Hans-Peter Nilsson wrote:

JFTR: No I didn't, Eric wrote the below.  But, it made sense to me. :)

> > > We apparently have a small conflict between the meaning of volatile asms with
> > > operands at the source level and volatile_insn_p.  However, I think that the
> > > latter interpretation is the correct one: if your asm statements have effects
> > > that can be described, then you should use output constraints instead of
> > > volatile; otherwise, you should use volatile and the output constraints have
> > > essentially no meaning.
>
> I strongly disagree with this.
> [...]

As long as volatile asms and UNSPEC_VOLATILE insns (aka.
barriers) are handled the same way and consistently throughout
gcc, I'm fine.  It seems your patch does that, so thanks!

> But the question is also what effects your patch can have e.g. on RTL DSE.

Looks like the patch caused a bootstrap for s390x.

Eagerly awaiting a PR for that, but whoever is interested
on that, please try Jakub's patch first...

> 2012-11-26  Jakub Jelinek  <jakub@redhat.com>
>
> 	PR debug/36728
> 	PR debug/55467
> 	* cselib.c (cselib_process_insn): If cselib_preserve_constants,
> 	don't reset table and exit early on volatile insns and setjmp.
> 	Reset table afterwards on setjmp.
>
> 	* gcc.dg/guality/pr36728-1.c: Include "../nop.h", make sure the asm
> 	are non-empty and add dependency between the first and second asm.
> 	* gcc.dg/guality/pr36728-2.c: Likewise.
> 	* gcc.dg/guality/pr36728-3.c: New test.
> 	* gcc.dg/guality/pr36728-4.c: New test.

brgds, H-P

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFA:] fix PR55030, wrong code from __builtin_setjmp_receiver
  2012-11-27 12:04         ` Jakub Jelinek
  2012-11-27 12:27           ` Hans-Peter Nilsson
@ 2012-11-27 13:32           ` Jakub Jelinek
  2012-11-27 17:39           ` Eric Botcazou
                             ` (2 subsequent siblings)
  4 siblings, 0 replies; 19+ messages in thread
From: Jakub Jelinek @ 2012-11-27 13:32 UTC (permalink / raw)
  To: Hans-Peter Nilsson; +Cc: Eric Botcazou, gcc-patches, Alexandre Oliva

On Tue, Nov 27, 2012 at 01:03:47PM +0100, Jakub Jelinek wrote:
> So, at least from var-tracking POV which doesn't attempt to perform any
> optimizations across any insn, just tries to track where values live, IMHO a
> volatile asm acts exactly the same as non-volatile, that is why I'm testing
> following patch right now.
> 
> 2012-11-26  Jakub Jelinek  <jakub@redhat.com>
> 
> 	PR debug/36728
> 	PR debug/55467
> 	* cselib.c (cselib_process_insn): If cselib_preserve_constants,
> 	don't reset table and exit early on volatile insns and setjmp.
> 	Reset table afterwards on setjmp.
> 
> 	* gcc.dg/guality/pr36728-1.c: Include "../nop.h", make sure the asm
> 	are non-empty and add dependency between the first and second asm.
> 	* gcc.dg/guality/pr36728-2.c: Likewise.
> 	* gcc.dg/guality/pr36728-3.c: New test.
> 	* gcc.dg/guality/pr36728-4.c: New test.

Bootstrapped/regtested fine on x86_64-linux and i686-linux (and fixes the
pr36728-[34].c failures that appear without the patch, which are of the
wrong debug kind).

	Jakub

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFA:] fix PR55030, wrong code from __builtin_setjmp_receiver
  2012-11-27 12:04         ` Jakub Jelinek
  2012-11-27 12:27           ` Hans-Peter Nilsson
  2012-11-27 13:32           ` Jakub Jelinek
@ 2012-11-27 17:39           ` Eric Botcazou
  2012-11-27 18:20             ` Jakub Jelinek
                               ` (2 more replies)
  2012-11-28 13:44           ` Hans-Peter Nilsson
  2012-11-28 20:53           ` Richard Henderson
  4 siblings, 3 replies; 19+ messages in thread
From: Eric Botcazou @ 2012-11-27 17:39 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: gcc-patches, Hans-Peter Nilsson, Alexandre Oliva

> I strongly disagree with this.  Outputs and clobbers have significant
> meaning even on volatile asms, asm volatile doesn't mean all registers and
> all memory are supposed to be considered clobbered.  For memory you have
> "memory" clobber for that, for registers it is users responsibility to
> describe exactly what changes, either in clobbers or in outputs.
> The difference between non-volatile and volatile asm is, as the
> documentation states:
> 
> The `volatile' keyword indicates that the instruction has important
> side-effects.  GCC will not delete a volatile `asm' if it is reachable.
> 
> Volatile asm acts as an optimization barrier to some extent, but it really
> can't modify registers or memory that aren't described as modified in the
> asm pattern.  The important side-effects are of some other kind than
> modifying registers or memory visible from the current function.
> Ditto for UNSPEC_VOLATILE.

Well, the last sentence would essentially void the entire argument I think.  
It's well established in the RTL middle-end that UNSPEC_VOLATILE can do pretty 
much everything behind the back of the compiler.

Now I agree that the discussion exists for volatile asms.  But you have for 
example in the unmodified cse.c:

  /* A volatile ASM invalidates everything.  */
  if (NONJUMP_INSN_P (insn)
      && GET_CODE (PATTERN (insn)) == ASM_OPERANDS
      && MEM_VOLATILE_P (PATTERN (insn)))
    flush_hash_table ();

and the comment of volatile_insn_p is rather explicit:

/* Nonzero if X contains any volatile instructions.  These are instructions
   which may cause unpredictable machine state instructions, and thus no
   instructions should be moved or combined across them.  This includes
   only volatile asms and UNSPEC_VOLATILE instructions.  */

The problem is that the various RTL passes don't have a consistent view on 
that so the patch attempts to tidy this up in a conservative manner.

I think that a compromise could be to say that volatile asms without outputs 
are full barriers (like UNSPEC_VOLATILE) but other volatile asms aren't.
That's what the unmodified cse.c, cselib.c and dse.c currently implement.
But implementing it consistently would mean weakening volatile_insn_p.

> So, at least from var-tracking POV which doesn't attempt to perform any
> optimizations across any insn, just tries to track where values live, IMHO a
> volatile asm acts exactly the same as non-volatile, that is why I'm testing
> following patch right now.
> 
> But the question is also what effects your patch can have e.g. on RTL DSE.

It will make all volatile asms be treated as volatile asms without outputs.

-- 
Eric Botcazou

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFA:] fix PR55030, wrong code from __builtin_setjmp_receiver
  2012-11-27 17:39           ` Eric Botcazou
@ 2012-11-27 18:20             ` Jakub Jelinek
  2012-12-01  0:03               ` Eric Botcazou
  2012-11-28 20:10             ` Richard Sandiford
  2012-11-28 20:46             ` Richard Henderson
  2 siblings, 1 reply; 19+ messages in thread
From: Jakub Jelinek @ 2012-11-27 18:20 UTC (permalink / raw)
  To: Eric Botcazou; +Cc: gcc-patches, Hans-Peter Nilsson, Alexandre Oliva

On Tue, Nov 27, 2012 at 06:35:52PM +0100, Eric Botcazou wrote:
> Now I agree that the discussion exists for volatile asms.  But you have for 
> example in the unmodified cse.c:
> 
>   /* A volatile ASM invalidates everything.  */
>   if (NONJUMP_INSN_P (insn)
>       && GET_CODE (PATTERN (insn)) == ASM_OPERANDS
>       && MEM_VOLATILE_P (PATTERN (insn)))
>     flush_hash_table ();

That was weird indeed, because
  asm volatile ("" : : "r" (x));
flushed (for targets that don't add any implicit clobbers) but e.g.
  asm volatile ("" : : "r" (x) : "r23");
wouldn't (then the pattern is a parallel with ASM_OPERANDS and some
CLOBBERs).  The effect of that is that it never triggered on i?86/x86_64,
because those have the implicit fprs/flags clobbers.

I was mainly arguing with the sentence that for asm volatile, the output
constraints (did you mean outputs and clobbers?) have essentially no
meaning.  While for some optimizations perhaps it might be desirable
to treat asm volatile as full optimization barrier, I'm not sure about all
of them (i.e. it would be important to look for performance degradations
caused by that change), and for var-tracking I'd argue that asm vs. asm
volatile is completely unimportant, if the asm volatile pattern (or even
UNSPEC_VOLATILE) doesn't say it clobbers or sets hard register XYZ, then it
can't change it (well, it could but is responsible of restoring it),
similarly for memory, if it doesn't clobber "memory" and doesn't set some
memory, it can't do that.

So, do you object even to the var-tracking change (which would mean if
var-tracking found that say some variable's value lives in hard register 12, when
encountering asm volatile it would mean to forget about that even when
hard register 12 isn't clobbered by the asm, nor set)?  For now var-tracking
seems to be the most important issue, as we generate invalid debug info
there (latent before, but never hit on inline asm on x86_64/i686 except
on setjmp calls).

As for other passes, the r193802 change e.g. regresses slightly modified
pr44194-1.c testcase on x86_64,
struct ints { int a, b, c; } foo();
void bar(int a, int b);

void func() {
  struct ints s = foo();
  int x;
  asm volatile ("" : "=r" (x));
  bar(s.a, s.b);
}

 func:
 .LFB0:
-	xorl	%eax, %eax
 	subq	$24, %rsp
 .LCFI0:
+	xorl	%eax, %eax
 	call	foo
-	movq	%rax, %rcx
-	shrq	$32, %rcx
-	movq	%rcx, %rsi
-	movl	%eax, %edi
+	movq	%rax, (%rsp)
+	movl	%edx, 8(%rsp)
+	movl	4(%rsp), %esi
+	movl	(%rsp), %edi
 	addq	$24, %rsp
 .LCFI1:
 	jmp	bar

To me it looks like an unnecessary pessimization, the volatile there
I understand that just it doesn't want to be moved around (e.g. scheduled
before the foo call or after bar), that it can't be DCEd, but as it doesn't
mention it modifies memory, I don't understand why it should force some
registers to stack and back when it has no way to know the compiler would be
emitting anything like that at all.
Compare that to
  asm volatile ("" : "=r" (x) : : "memory");
in the testcase, where the r193802 commit makes no difference, the
stores/loads are done in both cases.

For CSE, I'd agree it should treat asm volatile as barriers, so for cselib.c
we probably need some additional cselib_init flag how we want to treat
volatile asms...  For UNSPEC_VOLATILE, perhaps even DSE should stay
conservative, but for var-tracking.c I still don't see a reason for that.

	Jakub

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFA:] fix PR55030, wrong code from __builtin_setjmp_receiver
  2012-11-27 12:04         ` Jakub Jelinek
                             ` (2 preceding siblings ...)
  2012-11-27 17:39           ` Eric Botcazou
@ 2012-11-28 13:44           ` Hans-Peter Nilsson
  2012-11-28 13:56             ` Jakub Jelinek
  2012-11-28 20:53           ` Richard Henderson
  4 siblings, 1 reply; 19+ messages in thread
From: Hans-Peter Nilsson @ 2012-11-28 13:44 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Eric Botcazou, gcc-patches, Alexandre Oliva

On Tue, 27 Nov 2012, Jakub Jelinek wrote:
> 2012-11-26  Jakub Jelinek  <jakub@redhat.com>
>
> 	PR debug/36728
> 	PR debug/55467
> 	* cselib.c (cselib_process_insn): If cselib_preserve_constants,
> 	don't reset table and exit early on volatile insns and setjmp.
> 	Reset table afterwards on setjmp.

It seems this also fixes the s390x-linux bootstrap; at least the
test-case in PR bootstrap/55511.  Thanks again.

(N.B. there may also be a bug in var-tracking, covered up by the
patch above.)

brgds, H-P

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFA:] fix PR55030, wrong code from __builtin_setjmp_receiver
  2012-11-28 13:44           ` Hans-Peter Nilsson
@ 2012-11-28 13:56             ` Jakub Jelinek
  0 siblings, 0 replies; 19+ messages in thread
From: Jakub Jelinek @ 2012-11-28 13:56 UTC (permalink / raw)
  To: Hans-Peter Nilsson; +Cc: Eric Botcazou, gcc-patches, Alexandre Oliva

On Wed, Nov 28, 2012 at 08:44:18AM -0500, Hans-Peter Nilsson wrote:
> On Tue, 27 Nov 2012, Jakub Jelinek wrote:
> > 2012-11-26  Jakub Jelinek  <jakub@redhat.com>
> >
> > 	PR debug/36728
> > 	PR debug/55467
> > 	* cselib.c (cselib_process_insn): If cselib_preserve_constants,
> > 	don't reset table and exit early on volatile insns and setjmp.
> > 	Reset table afterwards on setjmp.
> 
> It seems this also fixes the s390x-linux bootstrap; at least the
> test-case in PR bootstrap/55511.  Thanks again.
> 
> (N.B. there may also be a bug in var-tracking, covered up by the
> patch above.)

The patch is actually a fix for that.  The thing is that
both cselib was doing the wrong thing for the resets (not calling
cselib_preserve_only_constants () before cselib_reset_table resp.
cselib_reset_table not prepared to the thing that it would need
to flush all regs/mems, not just from the VALUEs that are kept in the hash
table, but also from those that are dropped), and also not adding some
special magic microoperation for the volatile insns (unclear what exactly
it would have to do).  By never handling the volatile insns specially during
var-tracking all those issues are gone.

	Jakub

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFA:] fix PR55030, wrong code from __builtin_setjmp_receiver
  2012-11-27 17:39           ` Eric Botcazou
  2012-11-27 18:20             ` Jakub Jelinek
@ 2012-11-28 20:10             ` Richard Sandiford
  2012-11-28 20:46             ` Richard Henderson
  2 siblings, 0 replies; 19+ messages in thread
From: Richard Sandiford @ 2012-11-28 20:10 UTC (permalink / raw)
  To: Eric Botcazou
  Cc: Jakub Jelinek, gcc-patches, Hans-Peter Nilsson, Alexandre Oliva

Eric Botcazou <ebotcazou@adacore.com> writes:
>> I strongly disagree with this.  Outputs and clobbers have significant
>> meaning even on volatile asms, asm volatile doesn't mean all registers and
>> all memory are supposed to be considered clobbered.  For memory you have
>> "memory" clobber for that, for registers it is users responsibility to
>> describe exactly what changes, either in clobbers or in outputs.
>> The difference between non-volatile and volatile asm is, as the
>> documentation states:
>> 
>> The `volatile' keyword indicates that the instruction has important
>> side-effects.  GCC will not delete a volatile `asm' if it is reachable.
>> 
>> Volatile asm acts as an optimization barrier to some extent, but it really
>> can't modify registers or memory that aren't described as modified in the
>> asm pattern.  The important side-effects are of some other kind than
>> modifying registers or memory visible from the current function.
>> Ditto for UNSPEC_VOLATILE.
>
> Well, the last sentence would essentially void the entire argument I
> think.  It's well established in the RTL middle-end that
> UNSPEC_VOLATILE can do pretty much everything behind the back of the
> compiler.

As always when jumping in the middle of thread, I might well be missing
the point sorry, but this sounded a bit extreme if taken literally.
An UNSPEC_VOLATILE doesn't in itself force the function to save all
call-saved registers, so an UNSPEC_VOLATILE that modifies those registers
behind the back of the compiler would lead to us generating wrong code.
And I thought UNSPEC_VOLATILEs that also clobber visible memory in an
unpredictable way had to have something like (clobber (mem:BLK (scratch))).

I thought Jakub's "the important side-effects are of some other
kind than modifying registers or memory visible from the current
function" applied to UNSPEC_VOLATILE too.

Richard

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFA:] fix PR55030, wrong code from __builtin_setjmp_receiver
  2012-11-27 17:39           ` Eric Botcazou
  2012-11-27 18:20             ` Jakub Jelinek
  2012-11-28 20:10             ` Richard Sandiford
@ 2012-11-28 20:46             ` Richard Henderson
  2 siblings, 0 replies; 19+ messages in thread
From: Richard Henderson @ 2012-11-28 20:46 UTC (permalink / raw)
  To: Eric Botcazou
  Cc: Jakub Jelinek, gcc-patches, Hans-Peter Nilsson, Alexandre Oliva

On 11/27/2012 09:35 AM, Eric Botcazou wrote:
> It's well established in the RTL middle-end that UNSPEC_VOLATILE can do pretty 
> much everything behind the back of the compiler.

This is false.  U_V is a scheduling barrier, and is never deleted as
dead (as opposed to unreachable) code.  Certainly we don't do anything
so complete as invalidate all registers, nor do we invalidate memory.


r~

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFA:] fix PR55030, wrong code from __builtin_setjmp_receiver
  2012-11-27 12:04         ` Jakub Jelinek
                             ` (3 preceding siblings ...)
  2012-11-28 13:44           ` Hans-Peter Nilsson
@ 2012-11-28 20:53           ` Richard Henderson
  4 siblings, 0 replies; 19+ messages in thread
From: Richard Henderson @ 2012-11-28 20:53 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: Hans-Peter Nilsson, Eric Botcazou, gcc-patches, Alexandre Oliva

On 11/27/2012 04:03 AM, Jakub Jelinek wrote:
> 2012-11-26  Jakub Jelinek  <jakub@redhat.com>
> 
> 	PR debug/36728
> 	PR debug/55467
> 	* cselib.c (cselib_process_insn): If cselib_preserve_constants,
> 	don't reset table and exit early on volatile insns and setjmp.
> 	Reset table afterwards on setjmp.
> 
> 	* gcc.dg/guality/pr36728-1.c: Include "../nop.h", make sure the asm
> 	are non-empty and add dependency between the first and second asm.
> 	* gcc.dg/guality/pr36728-2.c: Likewise.
> 	* gcc.dg/guality/pr36728-3.c: New test.
> 	* gcc.dg/guality/pr36728-4.c: New test.

Ok.

It appears that PRs 55507 and 55511 are also fixed.


r~

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFA:] fix PR55030, wrong code from __builtin_setjmp_receiver
  2012-11-27 18:20             ` Jakub Jelinek
@ 2012-12-01  0:03               ` Eric Botcazou
  0 siblings, 0 replies; 19+ messages in thread
From: Eric Botcazou @ 2012-12-01  0:03 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: gcc-patches, Hans-Peter Nilsson, Alexandre Oliva

> I was mainly arguing with the sentence that for asm volatile, the output
> constraints (did you mean outputs and clobbers?) have essentially no
> meaning.  While for some optimizations perhaps it might be desirable
> to treat asm volatile as full optimization barrier, I'm not sure about all
> of them (i.e. it would be important to look for performance degradations
> caused by that change), and for var-tracking I'd argue that asm vs. asm
> volatile is completely unimportant, if the asm volatile pattern (or even
> UNSPEC_VOLATILE) doesn't say it clobbers or sets hard register XYZ, then it
> can't change it (well, it could but is responsible of restoring it),
> similarly for memory, if it doesn't clobber "memory" and doesn't set some
> memory, it can't do that.

Yes, I meant outputs and clobbers.  The origin of all this is:

`blockage'
     This pattern defines a pseudo insn that prevents the instruction
     scheduler and other passes from moving instructions and using
     register equivalences across the boundary defined by the blockage
     insn.  This needs to be an UNSPEC_VOLATILE pattern or a volatile
     ASM.

Now the typical blockage insn is that of the x86:

;; UNSPEC_VOLATILE is considered to use and clobber all hard registers and
;; all of memory.  This blocks insns from being moved across this point.

(define_insn "blockage"
  [(unspec_volatile [(const_int 0)] UNSPECV_BLOCKAGE)]
  ""
  ""
  [(set_attr "length" "0")])

[Note the pretty adamant comment about UNSPEC_VOLATILE here...]

So, if UNSPEC_VOLATILE is not treated specially, the blockage insn of most 
ports won't block hard register equivalences.  And blockages are precisely 
used to do that, see e.g. the untyped_call pattern of the x86:

  /* The optimizer does not know that the call sets the function value
     registers we stored in the result block.  We avoid problems by
     claiming that all hard registers are used and clobbered at this
     point.  */
  emit_insn (gen_blockage ());

That being said, I agree that volatile asms are a separate discussion.

> So, do you object even to the var-tracking change (which would mean if
> var-tracking found that say some variable's value lives in hard register 12,
> when encountering asm volatile it would mean to forget about that even when
> hard register 12 isn't clobbered by the asm, nor set)?  For now
> var-tracking seems to be the most important issue, as we generate invalid
> debug info there (latent before, but never hit on inline asm on x86_64/i686
> except on setjmp calls).

For volatile asms, no, thanks for fixing the fallout on this side.

> As for other passes, the r193802 change e.g. regresses slightly modified
> pr44194-1.c testcase on x86_64,
> struct ints { int a, b, c; } foo();
> void bar(int a, int b);
> 
> void func() {
>   struct ints s = foo();
>   int x;
>   asm volatile ("" : "=r" (x));
>   bar(s.a, s.b);
> }
> 
>  func:
>  .LFB0:
> -	xorl	%eax, %eax
>  	subq	$24, %rsp
>  .LCFI0:
> +	xorl	%eax, %eax
>  	call	foo
> -	movq	%rax, %rcx
> -	shrq	$32, %rcx
> -	movq	%rcx, %rsi
> -	movl	%eax, %edi
> +	movq	%rax, (%rsp)
> +	movl	%edx, 8(%rsp)
> +	movl	4(%rsp), %esi
> +	movl	(%rsp), %edi
>  	addq	$24, %rsp
>  .LCFI1:
>  	jmp	bar
> 
> To me it looks like an unnecessary pessimization, the volatile there
> I understand that just it doesn't want to be moved around (e.g. scheduled
> before the foo call or after bar), that it can't be DCEd, but as it doesn't
> mention it modifies memory, I don't understand why it should force some
> registers to stack and back when it has no way to know the compiler would be
> emitting anything like that at all.
> Compare that to
>   asm volatile ("" : "=r" (x) : : "memory");
> in the testcase, where the r193802 commit makes no difference, the
> stores/loads are done in both cases.

OK, I agree that the fallout for DSE, and the effect on memory in general, is 
undesirable, especially for volatile asms.

> For CSE, I'd agree it should treat asm volatile as barriers, so for cselib.c
> we probably need some additional cselib_init flag how we want to treat
> volatile asms...  For UNSPEC_VOLATILE, perhaps even DSE should stay
> conservative, but for var-tracking.c I still don't see a reason for that.

Thank you (as well as the others) for the detailed feedback.  Clearly my 
initial stance on this was a bit extreme and needs to be watered down.
I'll ponder about this over the week-end and propose something next week.

-- 
Eric Botcazou

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFA:] fix PR55030, wrong code from __builtin_setjmp_receiver
  2012-11-27 19:30 Uros Bizjak
@ 2012-11-28  9:50 ` Uros Bizjak
  0 siblings, 0 replies; 19+ messages in thread
From: Uros Bizjak @ 2012-11-28  9:50 UTC (permalink / raw)
  To: gcc-patches
  Cc: Jakub Jelinek, Hans-Peter Nilsson, Eric Botcazou, Alexandre Oliva

On Tue, Nov 27, 2012 at 8:29 PM, Uros Bizjak <ubizjak@gmail.com> wrote:

>> As long as volatile asms and UNSPEC_VOLATILE insns (aka.
>> barriers) are handled the same way and consistently throughout
>> gcc, I'm fine.  It seems your patch does that, so thanks!
>>
>> > But the question is also what effects your patch can have e.g. on RTL DSE.
>>
>> Looks like the patch caused a bootstrap for s390x.
>>
>> Eagerly awaiting a PR for that, but whoever is interested
>> on that, please try Jakub's patch first...
>>
>> > 2012-11-26  Jakub Jelinek  <jakub@redhat.com>
>> >
>> >     PR debug/36728
>> >     PR debug/55467
>> >     * cselib.c (cselib_process_insn): If cselib_preserve_constants,
>> >     don't reset table and exit early on volatile insns and setjmp.
>> >     Reset table afterwards on setjmp.
>> >
>> >     * gcc.dg/guality/pr36728-1.c: Include "../nop.h", make sure the asm
>> >     are non-empty and add dependency between the first and second asm.
>> >     * gcc.dg/guality/pr36728-2.c: Likewise.
>> >     * gcc.dg/guality/pr36728-3.c: New test.
>> >     * gcc.dg/guality/pr36728-4.c: New test.
>
> I have hit the same ICE on alpha-linux-gnu bootstrap. Jakub's patch
> fixes the ICE, and allows bootstrap to pass well into stage2 now.
> However, it takes ~10 hours for full bootstrap+regtest to finish, will
> report back tomorrow morning (CET).

The results look OK [1]. Please note that I didn't patch the testsuite.

[1] http://gcc.gnu.org/ml/gcc-testresults/2012-11/msg02335.html

Uros.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFA:] fix PR55030, wrong code from __builtin_setjmp_receiver
@ 2012-11-27 19:30 Uros Bizjak
  2012-11-28  9:50 ` Uros Bizjak
  0 siblings, 1 reply; 19+ messages in thread
From: Uros Bizjak @ 2012-11-27 19:30 UTC (permalink / raw)
  To: gcc-patches
  Cc: Jakub Jelinek, Hans-Peter Nilsson, Eric Botcazou, Alexandre Oliva

Hello!

> On Tue, 27 Nov 2012, Jakub Jelinek wrote:
> > On Tue, Nov 27, 2012 at 06:44:23AM -0500, Hans-Peter Nilsson wrote:
>
> JFTR: No I didn't, Eric wrote the below.  But, it made sense to me. :)
>
> > > > We apparently have a small conflict between the meaning of volatile asms with
> > > > operands at the source level and volatile_insn_p.  However, I think that the
> > > > latter interpretation is the correct one: if your asm statements have effects
> > > > that can be described, then you should use output constraints instead of
> > > > volatile; otherwise, you should use volatile and the output constraints have
> > > > essentially no meaning.
> >
> > I strongly disagree with this.
> > [...]
>
> As long as volatile asms and UNSPEC_VOLATILE insns (aka.
> barriers) are handled the same way and consistently throughout
> gcc, I'm fine.  It seems your patch does that, so thanks!
>
> > But the question is also what effects your patch can have e.g. on RTL DSE.
>
> Looks like the patch caused a bootstrap for s390x.
>
> Eagerly awaiting a PR for that, but whoever is interested
> on that, please try Jakub's patch first...
>
> > 2012-11-26  Jakub Jelinek  <jakub@redhat.com>
> >
> > 	PR debug/36728
> > 	PR debug/55467
> > 	* cselib.c (cselib_process_insn): If cselib_preserve_constants,
> > 	don't reset table and exit early on volatile insns and setjmp.
> > 	Reset table afterwards on setjmp.
> >
> > 	* gcc.dg/guality/pr36728-1.c: Include "../nop.h", make sure the asm
> > 	are non-empty and add dependency between the first and second asm.
> > 	* gcc.dg/guality/pr36728-2.c: Likewise.
> > 	* gcc.dg/guality/pr36728-3.c: New test.
> > 	* gcc.dg/guality/pr36728-4.c: New test.

I have hit the same ICE on alpha-linux-gnu bootstrap. Jakub's patch
fixes the ICE, and allows bootstrap to pass well into stage2 now.
However, it takes ~10 hours for full bootstrap+regtest to finish, will
report back tomorrow morning (CET).

Uros.

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2012-12-01  0:03 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-11-12  4:49 [RFA:] fix PR55030, wrong code from __builtin_setjmp_receiver Hans-Peter Nilsson
2012-11-12  8:28 ` Eric Botcazou
2012-11-19  3:54   ` Hans-Peter Nilsson
2012-11-19  9:35     ` Eric Botcazou
2012-11-26  3:30       ` Hans-Peter Nilsson
2012-11-27 11:44       ` Hans-Peter Nilsson
2012-11-27 12:04         ` Jakub Jelinek
2012-11-27 12:27           ` Hans-Peter Nilsson
2012-11-27 13:32           ` Jakub Jelinek
2012-11-27 17:39           ` Eric Botcazou
2012-11-27 18:20             ` Jakub Jelinek
2012-12-01  0:03               ` Eric Botcazou
2012-11-28 20:10             ` Richard Sandiford
2012-11-28 20:46             ` Richard Henderson
2012-11-28 13:44           ` Hans-Peter Nilsson
2012-11-28 13:56             ` Jakub Jelinek
2012-11-28 20:53           ` Richard Henderson
2012-11-27 19:30 Uros Bizjak
2012-11-28  9:50 ` Uros Bizjak

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).