* Re: [RFA:] fix PR55030, wrong code from __builtin_setjmp_receiver
@ 2012-11-27 19:30 Uros Bizjak
2012-11-28 9:50 ` Uros Bizjak
0 siblings, 1 reply; 19+ messages in thread
From: Uros Bizjak @ 2012-11-27 19:30 UTC (permalink / raw)
To: gcc-patches
Cc: Jakub Jelinek, Hans-Peter Nilsson, Eric Botcazou, Alexandre Oliva
Hello!
> On Tue, 27 Nov 2012, Jakub Jelinek wrote:
> > On Tue, Nov 27, 2012 at 06:44:23AM -0500, Hans-Peter Nilsson wrote:
>
> JFTR: No I didn't, Eric wrote the below. But, it made sense to me. :)
>
> > > > We apparently have a small conflict between the meaning of volatile asms with
> > > > operands at the source level and volatile_insn_p. However, I think that the
> > > > latter interpretation is the correct one: if your asm statements have effects
> > > > that can be described, then you should use output constraints instead of
> > > > volatile; otherwise, you should use volatile and the output constraints have
> > > > essentially no meaning.
> >
> > I strongly disagree with this.
> > [...]
>
> As long as volatile asms and UNSPEC_VOLATILE insns (aka.
> barriers) are handled the same way and consistently throughout
> gcc, I'm fine. It seems your patch does that, so thanks!
>
> > But the question is also what effects your patch can have e.g. on RTL DSE.
>
> Looks like the patch caused a bootstrap for s390x.
>
> Eagerly awaiting a PR for that, but whoever is interested
> on that, please try Jakub's patch first...
>
> > 2012-11-26 Jakub Jelinek <jakub@redhat.com>
> >
> > PR debug/36728
> > PR debug/55467
> > * cselib.c (cselib_process_insn): If cselib_preserve_constants,
> > don't reset table and exit early on volatile insns and setjmp.
> > Reset table afterwards on setjmp.
> >
> > * gcc.dg/guality/pr36728-1.c: Include "../nop.h", make sure the asm
> > are non-empty and add dependency between the first and second asm.
> > * gcc.dg/guality/pr36728-2.c: Likewise.
> > * gcc.dg/guality/pr36728-3.c: New test.
> > * gcc.dg/guality/pr36728-4.c: New test.
I have hit the same ICE on alpha-linux-gnu bootstrap. Jakub's patch
fixes the ICE, and allows bootstrap to pass well into stage2 now.
However, it takes ~10 hours for full bootstrap+regtest to finish, will
report back tomorrow morning (CET).
Uros.
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [RFA:] fix PR55030, wrong code from __builtin_setjmp_receiver 2012-11-27 19:30 [RFA:] fix PR55030, wrong code from __builtin_setjmp_receiver Uros Bizjak @ 2012-11-28 9:50 ` Uros Bizjak 0 siblings, 0 replies; 19+ messages in thread From: Uros Bizjak @ 2012-11-28 9:50 UTC (permalink / raw) To: gcc-patches Cc: Jakub Jelinek, Hans-Peter Nilsson, Eric Botcazou, Alexandre Oliva On Tue, Nov 27, 2012 at 8:29 PM, Uros Bizjak <ubizjak@gmail.com> wrote: >> As long as volatile asms and UNSPEC_VOLATILE insns (aka. >> barriers) are handled the same way and consistently throughout >> gcc, I'm fine. It seems your patch does that, so thanks! >> >> > But the question is also what effects your patch can have e.g. on RTL DSE. >> >> Looks like the patch caused a bootstrap for s390x. >> >> Eagerly awaiting a PR for that, but whoever is interested >> on that, please try Jakub's patch first... >> >> > 2012-11-26 Jakub Jelinek <jakub@redhat.com> >> > >> > PR debug/36728 >> > PR debug/55467 >> > * cselib.c (cselib_process_insn): If cselib_preserve_constants, >> > don't reset table and exit early on volatile insns and setjmp. >> > Reset table afterwards on setjmp. >> > >> > * gcc.dg/guality/pr36728-1.c: Include "../nop.h", make sure the asm >> > are non-empty and add dependency between the first and second asm. >> > * gcc.dg/guality/pr36728-2.c: Likewise. >> > * gcc.dg/guality/pr36728-3.c: New test. >> > * gcc.dg/guality/pr36728-4.c: New test. > > I have hit the same ICE on alpha-linux-gnu bootstrap. Jakub's patch > fixes the ICE, and allows bootstrap to pass well into stage2 now. > However, it takes ~10 hours for full bootstrap+regtest to finish, will > report back tomorrow morning (CET). The results look OK [1]. Please note that I didn't patch the testsuite. [1] http://gcc.gnu.org/ml/gcc-testresults/2012-11/msg02335.html Uros. ^ permalink raw reply [flat|nested] 19+ messages in thread
* [RFA:] fix PR55030, wrong code from __builtin_setjmp_receiver @ 2012-11-12 4:49 Hans-Peter Nilsson 2012-11-12 8:28 ` Eric Botcazou 0 siblings, 1 reply; 19+ messages in thread From: Hans-Peter Nilsson @ 2012-11-12 4:49 UTC (permalink / raw) To: gcc-patches The problem exposed in PR55030 (repeatable on x86_64-linux with -m32 at r192676) is that the fake-frame-pointer "frame" is replaced with the actual-frame-pointer "bp" in cse1, around the critical insn in __builtin_setjmp_receiver that restores their defined offset. The patch in PR55030/r192676 removed a clobber of the frame-pointer, and cse1 felt free to replace the former register with the latter, as in the following expansion of __builtin_setjmp_receiver (in builtins.c: expand_builtin_setjmp_receiver) from a reduced test-case, see the PR. (You might find the setup/restore code having an unexpected order; there are two __builtin_setjmps in series and the setup of the second one is storing the wrong value for the frame-pointer.) The pre-transformation dump in cse1 says: (code_label/s 13 51 14 3 2 "" [2 uses]) (note 14 13 15 3 [bb 3] NOTE_INSN_BASIC_BLOCK) (insn 15 14 78 3 (use (reg/f:SI 6 bp)) /home/hp/gcctop/tmp/base/gcc/gcc/testsuite/gcc.c-torture/execute/builtins/pr55030-chk.c:31 -1 (nil)) (insn 78 15 17 3 (set (reg/f:SI 20 frame) (reg/f:SI 6 bp)) /home/hp/gcctop/tmp/base/gcc/gcc/testsuite/gcc.c-torture/execute/builtins/pr55030-chk.c:31 -1 (nil)) (insn 17 78 56 3 (unspec_volatile [ (const_int 0 [0]) ] UNSPECV_BLOCKAGE) /home/hp/gcctop/tmp/base/gcc/gcc/testsuite/gcc.c-torture/execute/builtins/pr55030-chk.c:31 643 {blockage} (nil)) (insn 56 17 57 3 (set (reg/f:SI 74) (symbol_ref:SI ("chk_fail_buf") [flags 0x40] <var_decl 0x7ffff7512688 chk_fail_buf>)) /home/hp/gcctop/tmp/base/gcc/gcc/testsuite/gcc.c-torture/execute/builtins/pr55030-chk.c:37 65 {*movsi_internal} (nil)) (insn 57 56 58 3 (set (mem:SI (reg/f:SI 74) [5 S4 A8]) (reg/f:SI 20 frame)) /home/hp/gcctop/tmp/base/gcc/gcc/testsuite/gcc.c-torture/execute/builtins/pr55030-chk.c:37 65 {*movsi_internal} (nil)) Before r192676 and after the reversion in r192701, there was/is that weird extra "(clobber (reg/f:SI 6 bp)" inserted before insn 17. But without that, we notice post-cse1-transformation, a change in insn 57: (code_label/s 13 51 14 3 2 "" [3 uses]) (note 14 13 15 3 [bb 3] NOTE_INSN_BASIC_BLOCK) (insn 15 14 78 3 (use (reg/f:SI 6 bp)) /home/hp/gcctop/tmp/base/gcc/gcc/testsuite/gcc.c-torture/execute/builtins/pr55030-chk.c:31 -1 (nil)) (insn 78 15 17 3 (set (reg/f:SI 20 frame) (reg/f:SI 6 bp)) /home/hp/gcctop/tmp/base/gcc/gcc/testsuite/gcc.c-torture/execute/builtins/pr55030-chk.c:31 -1 (nil)) (insn 17 78 56 3 (unspec_volatile [ (const_int 0 [0]) ] UNSPECV_BLOCKAGE) /home/hp/gcctop/tmp/base/gcc/gcc/testsuite/gcc.c-torture/execute/builtins/pr55030-chk.c:31 643 {blockage} (nil)) (insn 56 17 57 3 (set (reg/f:SI 74) (symbol_ref:SI ("chk_fail_buf") [flags 0x40] <var_decl 0x7ffff7512688 chk_fail_buf>)) /home/hp/gcctop/tmp/base/gcc/gcc/testsuite/gcc.c-torture/execute/builtins/pr55030-chk.c:37 65 {*movsi_internal} (nil)) (insn 57 56 58 3 (set (mem:SI (reg/f:SI 74) [5 S4 A8]) (reg/f:SI 6 bp)) /home/hp/gcctop/tmp/base/gcc/gcc/testsuite/gcc.c-torture/execute/builtins/pr55030-chk.c:37 65 {*movsi_internal} (nil)) It seemed wrong for this change to be stopped *only* by that (clobber bp). Stepping through the code for insn 78 and 57, I was looking for related special-casing of frame- and other similar registers in cse.c and was a bit confused when I found none (not counting hash_rtx_cb). (I admit lack of special-casing is a good sign, though. :) Then I was blinded by the obvious; the next insn, insn 17, is a "blockage" insn. This is a target-specific blockage insn, but with the general form found in all targets defining it. The default blockage is an empty asm-volatile, which is what cse_insn recognized. The blockage insn is there to "prevent scheduling" of the critical insns and register values. It's almost obvious that an unspec_volatile should have that effect "too"; at least that's intended by the code in expand_builtin_setjmp_receiver. Luckily (or unluckily regarding the presence of the bug) *this* cse code is not run post-frame-layout (post-reload-cse does not use this code), or it seems people would soon notice register values used from the wrong side of the blockage, considering its critical use at the restore of the stack-pointer. As mentioned in the previous patch, <http://gcc.gnu.org/ml/gcc-patches/2012-10/msg01172.html>, clobbering the frame-pointer (and then using it) does not seem the logical alternative to the patch below; the blockage insn should just do its job. I updated comments and documentation so it's more apparent that register uses should also not be moved across the blockage; see the patched comments. Tested native x86_64-unknown-linux-gnu w./wo. -m32 and before/after 192677. Ok to commit? gcc: PR middle-end/55030 * builtins.c (expand_builtin_setjmp_receiver): Update comment regarding purpose of blockage. * emit-rtl.c [!HAVE_blockage] (gen_blockage): Similarly for the head comment. * doc/md.texi (blockage): Update similarly. Change wording to require one of two forms, rather than implying a wider choice. * cse.c (cse_insn): Where checking for blocking insns, treat UNSPEC_VOLATILE as blocking, besides volatile ASM. Index: gcc/builtins.c =================================================================== --- gcc/builtins.c (revision 192677) +++ gcc/builtins.c (working copy) @@ -963,7 +963,8 @@ expand_builtin_setjmp_receiver (rtx rece /* We must not allow the code we just generated to be reordered by scheduling. Specifically, the update of the frame pointer must - happen immediately, not later. */ + happen immediately, not later. Similarly, we must block + (frame-related) register values to be used across this code. */ emit_insn (gen_blockage ()); } Index: gcc/cse.c =================================================================== --- gcc/cse.c (revision 192677) +++ gcc/cse.c (working copy) @@ -5660,10 +5660,11 @@ cse_insn (rtx insn) invalidate (XEXP (dest, 0), GET_MODE (dest)); } - /* A volatile ASM invalidates everything. */ + /* A volatile ASM or an UNSPEC_VOLATILE invalidates everything. */ if (NONJUMP_INSN_P (insn) - && GET_CODE (PATTERN (insn)) == ASM_OPERANDS - && MEM_VOLATILE_P (PATTERN (insn))) + && ((GET_CODE (PATTERN (insn)) == ASM_OPERANDS + && MEM_VOLATILE_P (PATTERN (insn))) + || GET_CODE (PATTERN (insn)) == UNSPEC_VOLATILE)) flush_hash_table (); /* Don't cse over a call to setjmp; on some machines (eg VAX) Index: gcc/emit-rtl.c =================================================================== --- gcc/emit-rtl.c (revision 192677) +++ gcc/emit-rtl.c (working copy) @@ -364,8 +364,8 @@ get_reg_attrs (tree decl, int offset) #if !HAVE_blockage -/* Generate an empty ASM_INPUT, which is used to block attempts to schedule - across this insn. */ +/* Generate an empty ASM_INPUT, which is used to block attempts to schedule, + and to block register equivalences to be seen across this insn. */ rtx gen_blockage (void) Index: gcc/doc/md.texi =================================================================== --- gcc/doc/md.texi (revision 192677) +++ gcc/doc/md.texi (working copy) @@ -5832,8 +5832,9 @@ the values of operands 1 and 2. @item @samp{blockage} This pattern defines a pseudo insn that prevents the instruction -scheduler from moving instructions across the boundary defined by the -blockage insn. Normally an UNSPEC_VOLATILE pattern. +scheduler and other passes from moving instructions and using register +equivalences across the boundary defined by the blockage insn. +This needs to be an UNSPEC_VOLATILE pattern or a volatile ASM. @cindex @code{memory_barrier} instruction pattern @item @samp{memory_barrier} brgds, H-P ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [RFA:] fix PR55030, wrong code from __builtin_setjmp_receiver 2012-11-12 4:49 Hans-Peter Nilsson @ 2012-11-12 8:28 ` Eric Botcazou 2012-11-19 3:54 ` Hans-Peter Nilsson 0 siblings, 1 reply; 19+ messages in thread From: Eric Botcazou @ 2012-11-12 8:28 UTC (permalink / raw) To: Hans-Peter Nilsson; +Cc: gcc-patches > This is a target-specific blockage insn, but with the general form > found in all targets defining it. The default blockage is an empty > asm-volatile, which is what cse_insn recognized. The blockage insn is > there to "prevent scheduling" of the critical insns and register > values. It's almost obvious that an unspec_volatile should have that > effect "too"; at least that's intended by the code in > expand_builtin_setjmp_receiver. Luckily (or unluckily regarding the > presence of the bug) *this* cse code is not run post-frame-layout > (post-reload-cse does not use this code), or it seems people would > soon notice register values used from the wrong side of the blockage, > considering its critical use at the restore of the stack-pointer. > As mentioned in the previous patch, > <http://gcc.gnu.org/ml/gcc-patches/2012-10/msg01172.html>, clobbering > the frame-pointer (and then using it) does not seem the logical > alternative to the patch below; the blockage insn should just do its job. Agreed. > I updated comments and documentation so it's more apparent that > register uses should also not be moved across the blockage; see the > patched comments. > > Tested native x86_64-unknown-linux-gnu w./wo. -m32 and before/after > 192677. Ok to commit? > > gcc: > PR middle-end/55030 > * builtins.c (expand_builtin_setjmp_receiver): Update comment > regarding purpose of blockage. > * emit-rtl.c [!HAVE_blockage] (gen_blockage): Similarly for > the head comment. > * doc/md.texi (blockage): Update similarly. Change wording to > require one of two forms, rather than implying a wider choice. > * cse.c (cse_insn): Where checking for blocking insns, treat > UNSPEC_VOLATILE as blocking, besides volatile ASM. That's fine on principle, but there is a predicate for this (volatile_insn_p) so I think we should use it here. Moreover, cselib_process_insn has the same check so we should adjust it as well, which in turn means that dse.c:scan_insn should probably be adjusted too. OK with these changes, thanks. -- Eric Botcazou ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [RFA:] fix PR55030, wrong code from __builtin_setjmp_receiver 2012-11-12 8:28 ` Eric Botcazou @ 2012-11-19 3:54 ` Hans-Peter Nilsson 2012-11-19 9:35 ` Eric Botcazou 0 siblings, 1 reply; 19+ messages in thread From: Hans-Peter Nilsson @ 2012-11-19 3:54 UTC (permalink / raw) To: Eric Botcazou; +Cc: gcc-patches, Alexandre Oliva On Mon, 12 Nov 2012, Eric Botcazou wrote: > > This is a target-specific blockage insn, but with the general form > > found in all targets defining it. The default blockage is an empty > > asm-volatile, which is what cse_insn recognized. The blockage insn is > > there to "prevent scheduling" of the critical insns and register > > values. It's almost obvious that an unspec_volatile should have that > > effect "too"; at least that's intended by the code in > > expand_builtin_setjmp_receiver. Luckily (or unluckily regarding the > > presence of the bug) *this* cse code is not run post-frame-layout > > (post-reload-cse does not use this code), or it seems people would > > soon notice register values used from the wrong side of the blockage, > > considering its critical use at the restore of the stack-pointer. > > As mentioned in the previous patch, > > <http://gcc.gnu.org/ml/gcc-patches/2012-10/msg01172.html>, clobbering > > the frame-pointer (and then using it) does not seem the logical > > alternative to the patch below; the blockage insn should just do its job. > > Agreed. > > > I updated comments and documentation so it's more apparent that > > register uses should also not be moved across the blockage; see the > > patched comments. > > > > Tested native x86_64-unknown-linux-gnu w./wo. -m32 and before/after > > 192677. Ok to commit? > > > > gcc: > > PR middle-end/55030 > > * builtins.c (expand_builtin_setjmp_receiver): Update comment > > regarding purpose of blockage. > > * emit-rtl.c [!HAVE_blockage] (gen_blockage): Similarly for > > the head comment. > > * doc/md.texi (blockage): Update similarly. Change wording to > > require one of two forms, rather than implying a wider choice. > > * cse.c (cse_insn): Where checking for blocking insns, treat > > UNSPEC_VOLATILE as blocking, besides volatile ASM. > > That's fine on principle, but there is a predicate for this (volatile_insn_p) > so I think we should use it here. Moreover, cselib_process_insn has the same > check so we should adjust it as well, which in turn means that dse.c:scan_insn > should probably be adjusted too. OK with these changes, thanks. Doh, I should have looked for that construct in other places. Thanks for the review. Here's an updated patch with that. Unfortunately, it causes regressions; read on for a very brief analysis. For x86_64-linux (base multilib): Running /home/hp/gcctop/tmp/pr55030-2n/gcc/gcc/testsuite/gcc.dg/guality/guality.exp ... ... FAIL: gcc.dg/guality/pr36728-1.c -O1 line 12 arg7 == 30 FAIL: gcc.dg/guality/pr36728-1.c -O1 line 14 arg7 == 30 FAIL: gcc.dg/guality/pr36728-1.c -O2 line 12 arg7 == 30 FAIL: gcc.dg/guality/pr36728-1.c -O2 line 14 arg7 == 30 FAIL: gcc.dg/guality/pr36728-1.c -O3 -fomit-frame-pointer line 12 arg7 == 30 FAIL: gcc.dg/guality/pr36728-1.c -O3 -fomit-frame-pointer line 14 arg7 == 30 FAIL: gcc.dg/guality/pr36728-1.c -O3 -g line 12 arg7 == 30 FAIL: gcc.dg/guality/pr36728-1.c -O3 -g line 14 arg7 == 30 FAIL: gcc.dg/guality/pr36728-1.c -Os line 12 arg7 == 30 FAIL: gcc.dg/guality/pr36728-1.c -Os line 14 arg7 == 30 FAIL: gcc.dg/guality/pr36728-1.c -O2 -flto -fno-use-linker-plugin -flto-partition=none line 12 arg7 == 30 FAIL: gcc.dg/guality/pr36728-1.c -O2 -flto -fno-use-linker-plugin -flto-partition=none line 14 arg7 == 30 FAIL: gcc.dg/guality/pr36728-1.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 12 arg7 == 30 FAIL: gcc.dg/guality/pr36728-1.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 14 arg7 == 30 And for -m32: FAIL: gcc.dg/guality/pr36728-1.c -O1 line 12 arg2 == 2 FAIL: gcc.dg/guality/pr36728-1.c -O1 line 12 arg3 == 3 FAIL: gcc.dg/guality/pr36728-1.c -O1 line 12 arg4 == 4 FAIL: gcc.dg/guality/pr36728-1.c -O1 line 12 arg5 == 5 FAIL: gcc.dg/guality/pr36728-1.c -O1 line 12 arg6 == 6 FAIL: gcc.dg/guality/pr36728-1.c -O1 line 12 arg7 == 30 FAIL: gcc.dg/guality/pr36728-1.c -O1 line 14 arg2 == 2 FAIL: gcc.dg/guality/pr36728-1.c -O1 line 14 arg3 == 3 FAIL: gcc.dg/guality/pr36728-1.c -O1 line 14 arg4 == 4 FAIL: gcc.dg/guality/pr36728-1.c -O1 line 14 arg5 == 5 FAIL: gcc.dg/guality/pr36728-1.c -O1 line 14 arg6 == 6 FAIL: gcc.dg/guality/pr36728-1.c -O1 line 14 arg7 == 30 FAIL: gcc.dg/guality/pr36728-1.c -O2 line 12 arg2 == 2 FAIL: gcc.dg/guality/pr36728-1.c -O2 line 12 arg3 == 3 FAIL: gcc.dg/guality/pr36728-1.c -O2 line 12 arg4 == 4 FAIL: gcc.dg/guality/pr36728-1.c -O2 line 12 arg5 == 5 FAIL: gcc.dg/guality/pr36728-1.c -O2 line 12 arg6 == 6 FAIL: gcc.dg/guality/pr36728-1.c -O2 line 12 arg7 == 30 FAIL: gcc.dg/guality/pr36728-1.c -O2 line 14 arg2 == 2 FAIL: gcc.dg/guality/pr36728-1.c -O2 line 14 arg3 == 3 FAIL: gcc.dg/guality/pr36728-1.c -O2 line 14 arg4 == 4 FAIL: gcc.dg/guality/pr36728-1.c -O2 line 14 arg5 == 5 FAIL: gcc.dg/guality/pr36728-1.c -O2 line 14 arg6 == 6 FAIL: gcc.dg/guality/pr36728-1.c -O2 line 14 arg7 == 30 FAIL: gcc.dg/guality/pr36728-1.c -O3 -fomit-frame-pointer line 12 arg2 == 2 FAIL: gcc.dg/guality/pr36728-1.c -O3 -fomit-frame-pointer line 12 arg3 == 3 FAIL: gcc.dg/guality/pr36728-1.c -O3 -fomit-frame-pointer line 12 arg4 == 4 FAIL: gcc.dg/guality/pr36728-1.c -O3 -fomit-frame-pointer line 12 arg5 == 5 FAIL: gcc.dg/guality/pr36728-1.c -O3 -fomit-frame-pointer line 12 arg6 == 6 FAIL: gcc.dg/guality/pr36728-1.c -O3 -fomit-frame-pointer line 12 arg7 == 30 FAIL: gcc.dg/guality/pr36728-1.c -O3 -fomit-frame-pointer line 14 arg2 == 2 FAIL: gcc.dg/guality/pr36728-1.c -O3 -fomit-frame-pointer line 14 arg3 == 3 FAIL: gcc.dg/guality/pr36728-1.c -O3 -fomit-frame-pointer line 14 arg4 == 4 FAIL: gcc.dg/guality/pr36728-1.c -O3 -fomit-frame-pointer line 14 arg5 == 5 FAIL: gcc.dg/guality/pr36728-1.c -O3 -fomit-frame-pointer line 14 arg6 == 6 FAIL: gcc.dg/guality/pr36728-1.c -O3 -fomit-frame-pointer line 14 arg7 == 30 FAIL: gcc.dg/guality/pr36728-1.c -O3 -g line 12 arg2 == 2 FAIL: gcc.dg/guality/pr36728-1.c -O3 -g line 12 arg3 == 3 FAIL: gcc.dg/guality/pr36728-1.c -O3 -g line 12 arg4 == 4 FAIL: gcc.dg/guality/pr36728-1.c -O3 -g line 12 arg5 == 5 FAIL: gcc.dg/guality/pr36728-1.c -O3 -g line 12 arg6 == 6 FAIL: gcc.dg/guality/pr36728-1.c -O3 -g line 12 arg7 == 30 FAIL: gcc.dg/guality/pr36728-1.c -O3 -g line 14 arg2 == 2 FAIL: gcc.dg/guality/pr36728-1.c -O3 -g line 14 arg3 == 3 FAIL: gcc.dg/guality/pr36728-1.c -O3 -g line 14 arg4 == 4 FAIL: gcc.dg/guality/pr36728-1.c -O3 -g line 14 arg5 == 5 FAIL: gcc.dg/guality/pr36728-1.c -O3 -g line 14 arg6 == 6 FAIL: gcc.dg/guality/pr36728-1.c -O3 -g line 14 arg7 == 30 FAIL: gcc.dg/guality/pr36728-1.c -Os line 12 arg2 == 2 FAIL: gcc.dg/guality/pr36728-1.c -Os line 12 arg3 == 3 FAIL: gcc.dg/guality/pr36728-1.c -Os line 12 arg4 == 4 FAIL: gcc.dg/guality/pr36728-1.c -Os line 12 arg5 == 5 FAIL: gcc.dg/guality/pr36728-1.c -Os line 12 arg6 == 6 FAIL: gcc.dg/guality/pr36728-1.c -Os line 12 arg7 == 30 FAIL: gcc.dg/guality/pr36728-1.c -Os line 14 arg2 == 2 FAIL: gcc.dg/guality/pr36728-1.c -Os line 14 arg3 == 3 FAIL: gcc.dg/guality/pr36728-1.c -Os line 14 arg4 == 4 FAIL: gcc.dg/guality/pr36728-1.c -Os line 14 arg5 == 5 FAIL: gcc.dg/guality/pr36728-1.c -Os line 14 arg6 == 6 FAIL: gcc.dg/guality/pr36728-1.c -Os line 14 arg7 == 30 FAIL: gcc.dg/guality/pr36728-1.c -O2 -flto -fno-use-linker-plugin -flto-partition=none line 12 arg2 == 2 FAIL: gcc.dg/guality/pr36728-1.c -O2 -flto -fno-use-linker-plugin -flto-partition=none line 12 arg3 == 3 FAIL: gcc.dg/guality/pr36728-1.c -O2 -flto -fno-use-linker-plugin -flto-partition=none line 12 arg4 == 4 FAIL: gcc.dg/guality/pr36728-1.c -O2 -flto -fno-use-linker-plugin -flto-partition=none line 12 arg5 == 5 FAIL: gcc.dg/guality/pr36728-1.c -O2 -flto -fno-use-linker-plugin -flto-partition=none line 12 arg6 == 6 FAIL: gcc.dg/guality/pr36728-1.c -O2 -flto -fno-use-linker-plugin -flto-partition=none line 12 arg7 == 30 FAIL: gcc.dg/guality/pr36728-1.c -O2 -flto -fno-use-linker-plugin -flto-partition=none line 14 arg2 == 2 FAIL: gcc.dg/guality/pr36728-1.c -O2 -flto -fno-use-linker-plugin -flto-partition=none line 14 arg3 == 3 FAIL: gcc.dg/guality/pr36728-1.c -O2 -flto -fno-use-linker-plugin -flto-partition=none line 14 arg4 == 4 FAIL: gcc.dg/guality/pr36728-1.c -O2 -flto -fno-use-linker-plugin -flto-partition=none line 14 arg5 == 5 FAIL: gcc.dg/guality/pr36728-1.c -O2 -flto -fno-use-linker-plugin -flto-partition=none line 14 arg6 == 6 FAIL: gcc.dg/guality/pr36728-1.c -O2 -flto -fno-use-linker-plugin -flto-partition=none line 14 arg7 == 30 FAIL: gcc.dg/guality/pr36728-1.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 12 arg1 == 1 FAIL: gcc.dg/guality/pr36728-1.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 14 arg1 == 1 I looked into gcc.dg/guality/pr36728-1.c at base -O1. The generated assembly code modulo debug info, is the same. The value of arg7 is optimized out, says gdb in gcc.log. The problem doesn't seem to be any md-generated frame-related-barrier as was my first guess, but the volatile asms in the source(!). The first one looks like this (and the second one similar) in pr36728-1.c.168r.cse1 (r193583): (insn 26 25 27 2 (parallel [ (set (mem/c:SI (plus:DI (reg/f:DI 20 frame) (const_int -32 [0xffffffffffffffe0])) [0 y+0 S4 A256]) (asm_operands/v:SI ("") ("=m") 0 [ (mem/c:SI (plus:DI (reg/f:DI 20 frame) (const_int -32 [0xffffffffffffffe0])) [0 y+0 S4 A256]) ] [ (asm_input:SI ("m") (null):0) ] [] /home/hp/gcctop/tmp/nbasf/gcc/gcc/testsuite/gcc.dg/guality/pr36728-1.c:12)) (clobber (reg:QI 18 fpsr)) (clobber (reg:QI 17 flags)) ]) /home/hp/gcctop/tmp/nbasf/gcc/gcc/testsuite/gcc.dg/guality/pr36728-1.c:12 -1 (nil)) It's not caught by the previous test: - && GET_CODE (PATTERN (insn)) == ASM_OPERANDS - && MEM_VOLATILE_P (PATTERN (insn))) ...but since it is a volatile asm (as seen in the source and by the /v on the asm_operands) volatile_insn_p does catch it (as expected and intended) and down the road for some reason, gcc can't recover the arg7 contents. Somewhat expected, but this volatile asm is also more volatile than intended; a clobber list for example as seen above inserted by the md, is now redundant. I'm not sure what to do with this. Try changing volatile_insn_p adding a parameter to optionally allow volatile asms with outputs to pass? But then, when *should* that distinction be done, to let such volatile asms be allowed to pass as not-really-as-volatile-as-we-look-for? I'd think "never" for any of the the patched code, or maybe "only when looking at effects on memory". Maybe there's some known gotcha in debug handling. I'm CC:ing Alexandre, in case he happens to have input to this off the top of his head (or has copious spare time :) PS. without the dse.c and cselib.c changes, it passed testing. gcc: PR middle-end/55030 * builtins.c (expand_builtin_setjmp_receiver): Update comment regarding purpose of blockage. * emit-rtl.c [!HAVE_blockage] (gen_blockage): Similarly for the head comment. * rtlanal.c (volatile_insn_p): Ditto. * doc/md.texi (blockage): Update similarly. Change wording to require one of two forms, rather than implying a wider choice. * cse.c (cse_insn): Where checking for blocking insns, use volatile_insn_p instead of manual check for volatile ASM. * dse.c (scan_insn): Ditto. * cselib.c (cselib_process_insn): Ditto. Index: gcc/dse.c =================================================================== --- gcc/dse.c (revision 192677) +++ gcc/dse.c (working copy) @@ -2522,8 +2522,7 @@ scan_insn (bb_info_t bb_info, rtx insn) /* Cselib clears the table for this case, so we have to essentially do the same. */ if (NONJUMP_INSN_P (insn) - && GET_CODE (PATTERN (insn)) == ASM_OPERANDS - && MEM_VOLATILE_P (PATTERN (insn))) + && volatile_insn_p (PATTERN (insn))) { add_wild_read (bb_info); insn_info->cannot_delete = true; Index: gcc/rtlanal.c =================================================================== --- gcc/rtlanal.c (revision 192677) +++ gcc/rtlanal.c (working copy) @@ -2083,8 +2083,8 @@ remove_node_from_expr_list (const_rtx no /* Nonzero if X contains any volatile instructions. These are instructions which may cause unpredictable machine state instructions, and thus no - instructions should be moved or combined across them. This includes - only volatile asms and UNSPEC_VOLATILE instructions. */ + instructions or register uses should be moved or combined across them. + This includes only volatile asms and UNSPEC_VOLATILE instructions. */ int volatile_insn_p (const_rtx x) Index: gcc/builtins.c =================================================================== --- gcc/builtins.c (revision 192677) +++ gcc/builtins.c (working copy) @@ -963,7 +963,8 @@ expand_builtin_setjmp_receiver (rtx rece /* We must not allow the code we just generated to be reordered by scheduling. Specifically, the update of the frame pointer must - happen immediately, not later. */ + happen immediately, not later. Similarly, we must block + (frame-related) register values to be used across this code. */ emit_insn (gen_blockage ()); } Index: gcc/cselib.c =================================================================== --- gcc/cselib.c (revision 192677) +++ gcc/cselib.c (working copy) @@ -2607,13 +2607,12 @@ cselib_process_insn (rtx insn) cselib_current_insn = insn; - /* Forget everything at a CODE_LABEL, a volatile asm, or a setjmp. */ + /* Forget everything at a CODE_LABEL, a volatile insn, or a setjmp. */ if (LABEL_P (insn) || (CALL_P (insn) && find_reg_note (insn, REG_SETJMP, NULL)) || (NONJUMP_INSN_P (insn) - && GET_CODE (PATTERN (insn)) == ASM_OPERANDS - && MEM_VOLATILE_P (PATTERN (insn)))) + && volatile_insn_p (PATTERN (insn)))) { cselib_reset_table (next_uid); cselib_current_insn = NULL_RTX; Index: gcc/cse.c =================================================================== --- gcc/cse.c (revision 192677) +++ gcc/cse.c (working copy) @@ -5660,10 +5660,9 @@ cse_insn (rtx insn) invalidate (XEXP (dest, 0), GET_MODE (dest)); } - /* A volatile ASM invalidates everything. */ + /* A volatile ASM or an UNSPEC_VOLATILE invalidates everything. */ if (NONJUMP_INSN_P (insn) - && GET_CODE (PATTERN (insn)) == ASM_OPERANDS - && MEM_VOLATILE_P (PATTERN (insn))) + && volatile_insn_p (PATTERN (insn))) flush_hash_table (); /* Don't cse over a call to setjmp; on some machines (eg VAX) Index: gcc/emit-rtl.c =================================================================== --- gcc/emit-rtl.c (revision 192677) +++ gcc/emit-rtl.c (working copy) @@ -364,8 +364,8 @@ get_reg_attrs (tree decl, int offset) #if !HAVE_blockage -/* Generate an empty ASM_INPUT, which is used to block attempts to schedule - across this insn. */ +/* Generate an empty ASM_INPUT, which is used to block attempts to schedule, + and to block register equivalences to be seen across this insn. */ rtx gen_blockage (void) Index: gcc/doc/md.texi =================================================================== --- gcc/doc/md.texi (revision 192677) +++ gcc/doc/md.texi (working copy) @@ -5832,8 +5832,9 @@ the values of operands 1 and 2. @item @samp{blockage} This pattern defines a pseudo insn that prevents the instruction -scheduler from moving instructions across the boundary defined by the -blockage insn. Normally an UNSPEC_VOLATILE pattern. +scheduler and other passes from moving instructions and using register +equivalences across the boundary defined by the blockage insn. +This needs to be an UNSPEC_VOLATILE pattern or a volatile ASM. @cindex @code{memory_barrier} instruction pattern @item @samp{memory_barrier} brgds, H-P ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [RFA:] fix PR55030, wrong code from __builtin_setjmp_receiver 2012-11-19 3:54 ` Hans-Peter Nilsson @ 2012-11-19 9:35 ` Eric Botcazou 2012-11-26 3:30 ` Hans-Peter Nilsson 2012-11-27 11:44 ` Hans-Peter Nilsson 0 siblings, 2 replies; 19+ messages in thread From: Eric Botcazou @ 2012-11-19 9:35 UTC (permalink / raw) To: Hans-Peter Nilsson; +Cc: gcc-patches, Alexandre Oliva, Jakub Jelinek [-- Attachment #1: Type: text/plain, Size: 4251 bytes --] > Unfortunately, it causes regressions; read on for a very brief > analysis. > > For x86_64-linux (base multilib): > > Running > /home/hp/gcctop/tmp/pr55030-2n/gcc/gcc/testsuite/gcc.dg/guality/guality.exp > ... ... > FAIL: gcc.dg/guality/pr36728-1.c -O1 line 12 arg7 == 30 > [...] > > I looked into gcc.dg/guality/pr36728-1.c at base -O1. > > The generated assembly code modulo debug info, is the same. The > value of arg7 is optimized out, says gdb in gcc.log. The > problem doesn't seem to be any md-generated > frame-related-barrier as was my first guess, but the volatile > asms in the source(!). The first one looks like this (and the > second one similar) in pr36728-1.c.168r.cse1 (r193583): > > (insn 26 25 27 2 (parallel [ > (set (mem/c:SI (plus:DI (reg/f:DI 20 frame) > (const_int -32 [0xffffffffffffffe0])) [0 y+0 S4 > A256]) (asm_operands/v:SI ("") ("=m") 0 [ > (mem/c:SI (plus:DI (reg/f:DI 20 frame) > (const_int -32 [0xffffffffffffffe0])) [0 y+0 > S4 A256]) ] > [ > (asm_input:SI ("m") (null):0) > ] > [] > /home/hp/gcctop/tmp/nbasf/gcc/gcc/testsuite/gcc.dg/guality/pr36728-1.c:12)) > (clobber (reg:QI 18 fpsr)) > (clobber (reg:QI 17 flags)) > ]) > /home/hp/gcctop/tmp/nbasf/gcc/gcc/testsuite/gcc.dg/guality/pr36728-1.c:12 > -1 (nil)) > > It's not caught by the previous test: > - && GET_CODE (PATTERN (insn)) == ASM_OPERANDS > - && MEM_VOLATILE_P (PATTERN (insn))) > > ...but since it is a volatile asm (as seen in the source and by > the /v on the asm_operands) volatile_insn_p does catch it (as > expected and intended) and down the road for some reason, gcc > can't recover the arg7 contents. Somewhat expected, but this > volatile asm is also more volatile than intended; a clobber list > for example as seen above inserted by the md, is now redundant. Thanks for the analysis. I don't think that the redundancy of the clobber list matters here: the clobbers are added to all asm statements, volatile or not, so they aren't intended to make the volatile statements more precise in the hope of optimizing around them. > I'm not sure what to do with this. Try changing volatile_insn_p > adding a parameter to optionally allow volatile asms with > outputs to pass? But then, when *should* that distinction be > done, to let such volatile asms be allowed to pass as > not-really-as-volatile-as-we-look-for? I'd think "never" for > any of the the patched code, or maybe "only when looking at > effects on memory". Yes, weakening volatile_insn_p seems also dangerous to me, and I'd rather lean towards the opposite, conservative side. We apparently have a small conflict between the meaning of volatile asms with operands at the source level and volatile_insn_p. However, I think that the latter interpretation is the correct one: if your asm statements have effects that can be described, then you should use output constraints instead of volatile; otherwise, you should use volatile and the output constraints have essentially no meaning. The gcc.dg/guality/pr36728-[12].c testcases use volatile asms in an artificial way to coax the compiler into following a certain behaviour, so I don't think that they should be taken into account to judge the correctness of the change. Therefore, I'd go ahead and apply the full patch below, possibly adjusting both testcases to cope with it. Tentative (and untested) patch attached. I've also CCed Jakub though, as he might have a different opinion. > gcc: > PR middle-end/55030 > * builtins.c (expand_builtin_setjmp_receiver): Update comment > regarding purpose of blockage. > * emit-rtl.c [!HAVE_blockage] (gen_blockage): Similarly for > the head comment. > * rtlanal.c (volatile_insn_p): Ditto. > * doc/md.texi (blockage): Update similarly. Change wording to > require one of two forms, rather than implying a wider choice. > * cse.c (cse_insn): Where checking for blocking insns, use > volatile_insn_p instead of manual check for volatile ASM. > * dse.c (scan_insn): Ditto. > * cselib.c (cselib_process_insn): Ditto. -- Eric Botcazou [-- Attachment #2: p.diff --] [-- Type: text/x-patch, Size: 2027 bytes --] Index: gcc.dg/guality/pr36728-1.c =================================================================== --- gcc.dg/guality/pr36728-1.c (revision 193596) +++ gcc.dg/guality/pr36728-1.c (working copy) @@ -9,10 +9,10 @@ foo (int arg1, int arg2, int arg3, int a int __attribute__ ((aligned(32))) y; y = 2; - asm volatile ("" : "=m" (y) : "m" (y)); + asm ("" : "=m" (y) : "m" (y)); x[0] = 25; - asm volatile ("" : "=m" (x[0]) : "m" (x[0])); - return y; + asm ("" : "=m" (x[0]) : "m" (x[0])); + return y + x[0]; } /* On s390(x) r2 and r3 are (depending on the optimization level) used @@ -39,11 +39,13 @@ foo (int arg1, int arg2, int arg3, int a /* { dg-final { gdb-test 14 "*x" "(char) 25" } } */ /* { dg-final { gdb-test 14 "y" "2" } } */ +int a; + int main () { int l = 0; - asm volatile ("" : "=r" (l) : "0" (l)); - foo (l + 1, l + 2, l + 3, l + 4, l + 5, l + 6, l + 30); + asm ("" : "=r" (l) : "0" (l)); + a = foo (l + 1, l + 2, l + 3, l + 4, l + 5, l + 6, l + 30); return 0; } Index: gcc.dg/guality/pr36728-2.c =================================================================== --- gcc.dg/guality/pr36728-2.c (revision 193596) +++ gcc.dg/guality/pr36728-2.c (working copy) @@ -9,10 +9,10 @@ foo (int arg1, int arg2, int arg3, int a int __attribute__ ((aligned(32))) y; y = 2; - asm volatile ("" : "=m" (y) : "m" (y)); + asm ("" : "=m" (y) : "m" (y)); x[0] = 25; - asm volatile ("" : "=m" (x[0]) : "m" (x[0])); - return y; + asm ("" : "=m" (x[0]) : "m" (x[0])); + return y + x[0]; } /* On s390(x) r2 and r3 are (depending on the optimization level) used @@ -39,11 +39,13 @@ foo (int arg1, int arg2, int arg3, int a /* { dg-final { gdb-test 14 "*x" "(char) 25" } } */ /* { dg-final { gdb-test 14 "y" "2" } } */ +int a; + int main () { int l = 0; - asm volatile ("" : "=r" (l) : "0" (l)); - foo (l + 1, l + 2, l + 3, l + 4, l + 5, l + 6, l + 30); + asm ("" : "=r" (l) : "0" (l)); + a = foo (l + 1, l + 2, l + 3, l + 4, l + 5, l + 6, l + 30); return 0; } ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [RFA:] fix PR55030, wrong code from __builtin_setjmp_receiver 2012-11-19 9:35 ` Eric Botcazou @ 2012-11-26 3:30 ` Hans-Peter Nilsson 2012-11-27 11:44 ` Hans-Peter Nilsson 1 sibling, 0 replies; 19+ messages in thread From: Hans-Peter Nilsson @ 2012-11-26 3:30 UTC (permalink / raw) To: Eric Botcazou; +Cc: gcc-patches, Alexandre Oliva, Jakub Jelinek Thanks for the reviews! On Mon, 19 Nov 2012, Eric Botcazou wrote: > Thanks for the analysis. I don't think that the redundancy of the clobber > list matters here: the clobbers are added to all asm statements, volatile or > not, so they aren't intended to make the volatile statements more precise in > the hope of optimizing around them. Oh, right. > > I'm not sure what to do with this. Try changing volatile_insn_p > > adding a parameter to optionally allow volatile asms with > > outputs to pass? But then, when *should* that distinction be > > done, to let such volatile asms be allowed to pass as > > not-really-as-volatile-as-we-look-for? I'd think "never" for > > any of the the patched code, or maybe "only when looking at > > effects on memory". > > Yes, weakening volatile_insn_p seems also dangerous to me, and I'd rather lean > towards the opposite, conservative side. > > We apparently have a small conflict between the meaning of volatile asms with > operands at the source level and volatile_insn_p. However, I think that the > latter interpretation is the correct one: if your asm statements have effects > that can be described, then you should use output constraints instead of > volatile; otherwise, you should use volatile and the output constraints have > essentially no meaning. > > The gcc.dg/guality/pr36728-[12].c testcases use volatile asms in an artificial > way to coax the compiler into following a certain behaviour, so I don't think > that they should be taken into account to judge the correctness of the change. > Therefore, I'd go ahead and apply the full patch below, possibly adjusting > both testcases to cope with it. Tentative (and untested) patch attached. > > I've also CCed Jakub though, as he might have a different opinion. Ok, no comments, so let's go with adjusting the test-cases. But, your patch for the test-cases doesn't work with (or without) the patch for pr36728-2.c. Bonus points for being very close, though! I don't really see *why* your patch shouldn't work. Specifically, I don't see why the *wrong value* is displayed instead of at least "<optimized out>" for any changed behavior, so I entered PR55467 for the related observations. By instead of your approach; not using x[0] after the asm, and instead changing the asm to also write the result to an external variable (i.e. an unknown operation writing there and x[0], with x[0] as input), I think I mimicked the intended behavior, such that x[0] is preserved as a variable until the line with the asm. At least nothing changes for -O1 besides debug info and the write to the variable "a" in main, for an unpatched gcc with/without patched test-case. (The patched test-cases all pass for an otherwise unpatched gcc at r193777.) Committed after committing the reapplied previous patch; r193802 and r193803. > > gcc: > > PR middle-end/55030 > > * builtins.c (expand_builtin_setjmp_receiver): Update comment > > regarding purpose of blockage. > > * emit-rtl.c [!HAVE_blockage] (gen_blockage): Similarly for > > the head comment. > > * rtlanal.c (volatile_insn_p): Ditto. > > * doc/md.texi (blockage): Update similarly. Change wording to > > require one of two forms, rather than implying a wider choice. > > * cse.c (cse_insn): Where checking for blocking insns, use > > volatile_insn_p instead of manual check for volatile ASM. > > * dse.c (scan_insn): Ditto. > > * cselib.c (cselib_process_insn): Ditto. testsuite: PR middle-end/55030 * gcc.dg/guality/pr36728-1.c, gcc.dg/guality/pr36728-2.c (foo): Don't use volatile asms, use plain asms. Where the output value for the asm is unused, write a global variable. Index: gcc/testsuite/gcc.dg/guality/pr36728-1.c =================================================================== --- gcc/testsuite/gcc.dg/guality/pr36728-1.c (revision 193776) +++ gcc/testsuite/gcc.dg/guality/pr36728-1.c (working copy) @@ -1,7 +1,7 @@ /* PR debug/36728 */ /* { dg-do run } */ /* { dg-options "-g" } */ - +int a; int __attribute__((noinline)) foo (int arg1, int arg2, int arg3, int arg4, int arg5, int arg6, int arg7) { @@ -9,9 +9,9 @@ foo (int arg1, int arg2, int arg3, int a int __attribute__ ((aligned(32))) y; y = 2; - asm volatile ("" : "=m" (y) : "m" (y)); + asm ("" : "=m" (y) : "m" (y)); x[0] = 25; - asm volatile ("" : "=m" (x[0]) : "m" (x[0])); + asm ("" : "=m" (x[0]), "=m" (a) : "m" (x[0])); return y; } @@ -43,7 +43,7 @@ int main () { int l = 0; - asm volatile ("" : "=r" (l) : "0" (l)); - foo (l + 1, l + 2, l + 3, l + 4, l + 5, l + 6, l + 30); + asm ("" : "=r" (l) : "0" (l)); + a = foo (l + 1, l + 2, l + 3, l + 4, l + 5, l + 6, l + 30); return 0; } Index: gcc/testsuite/gcc.dg/guality/pr36728-2.c =================================================================== --- gcc/testsuite/gcc.dg/guality/pr36728-2.c (revision 193776) +++ gcc/testsuite/gcc.dg/guality/pr36728-2.c (working copy) @@ -1,7 +1,7 @@ /* PR debug/36728 */ /* { dg-do run } */ /* { dg-options "-g" } */ - +int a; int __attribute__((noinline)) foo (int arg1, int arg2, int arg3, int arg4, int arg5, int arg6, int arg7) { @@ -9,9 +9,9 @@ foo (int arg1, int arg2, int arg3, int a int __attribute__ ((aligned(32))) y; y = 2; - asm volatile ("" : "=m" (y) : "m" (y)); + asm ("" : "=m" (y) : "m" (y)); x[0] = 25; - asm volatile ("" : "=m" (x[0]) : "m" (x[0])); + asm ("" : "=m" (x[0]), "=m" (a) : "m" (x[0])); return y; } @@ -43,7 +43,7 @@ int main () { int l = 0; - asm volatile ("" : "=r" (l) : "0" (l)); - foo (l + 1, l + 2, l + 3, l + 4, l + 5, l + 6, l + 30); + asm ("" : "=r" (l) : "0" (l)); + a = foo (l + 1, l + 2, l + 3, l + 4, l + 5, l + 6, l + 30); return 0; } ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [RFA:] fix PR55030, wrong code from __builtin_setjmp_receiver 2012-11-19 9:35 ` Eric Botcazou 2012-11-26 3:30 ` Hans-Peter Nilsson @ 2012-11-27 11:44 ` Hans-Peter Nilsson 2012-11-27 12:04 ` Jakub Jelinek 1 sibling, 1 reply; 19+ messages in thread From: Hans-Peter Nilsson @ 2012-11-27 11:44 UTC (permalink / raw) To: Eric Botcazou; +Cc: gcc-patches, Alexandre Oliva, Jakub Jelinek I quoted the whole discussion, see a single line below. On Mon, 19 Nov 2012, Eric Botcazou wrote: > > Unfortunately, it causes regressions; read on for a very brief > > analysis. > > > > For x86_64-linux (base multilib): > > > > Running > > /home/hp/gcctop/tmp/pr55030-2n/gcc/gcc/testsuite/gcc.dg/guality/guality.exp > > ... ... > > FAIL: gcc.dg/guality/pr36728-1.c -O1 line 12 arg7 == 30 > > [...] > > > > I looked into gcc.dg/guality/pr36728-1.c at base -O1. > > > > The generated assembly code modulo debug info, is the same. The > > value of arg7 is optimized out, says gdb in gcc.log. The > > problem doesn't seem to be any md-generated > > frame-related-barrier as was my first guess, but the volatile > > asms in the source(!). The first one looks like this (and the > > second one similar) in pr36728-1.c.168r.cse1 (r193583): > > > > (insn 26 25 27 2 (parallel [ > > (set (mem/c:SI (plus:DI (reg/f:DI 20 frame) > > (const_int -32 [0xffffffffffffffe0])) [0 y+0 S4 > > A256]) (asm_operands/v:SI ("") ("=m") 0 [ > > (mem/c:SI (plus:DI (reg/f:DI 20 frame) > > (const_int -32 [0xffffffffffffffe0])) [0 y+0 > > S4 A256]) ] > > [ > > (asm_input:SI ("m") (null):0) > > ] > > [] > > /home/hp/gcctop/tmp/nbasf/gcc/gcc/testsuite/gcc.dg/guality/pr36728-1.c:12)) > > (clobber (reg:QI 18 fpsr)) > > (clobber (reg:QI 17 flags)) > > ]) > > /home/hp/gcctop/tmp/nbasf/gcc/gcc/testsuite/gcc.dg/guality/pr36728-1.c:12 > > -1 (nil)) > > > > It's not caught by the previous test: > > - && GET_CODE (PATTERN (insn)) == ASM_OPERANDS > > - && MEM_VOLATILE_P (PATTERN (insn))) > > > > ...but since it is a volatile asm (as seen in the source and by > > the /v on the asm_operands) volatile_insn_p does catch it (as > > expected and intended) and down the road for some reason, gcc > > can't recover the arg7 contents. Somewhat expected, but this > > volatile asm is also more volatile than intended; a clobber list > > for example as seen above inserted by the md, is now redundant. > > Thanks for the analysis. I don't think that the redundancy of the clobber > list matters here: the clobbers are added to all asm statements, volatile or > not, so they aren't intended to make the volatile statements more precise in > the hope of optimizing around them. IIIUC, Jakub disagrees on this point, see <http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55467#c13>. Sigh, hoping we can get consensus on this point, or at least that gcc has a consistent view... > > I'm not sure what to do with this. Try changing volatile_insn_p > > adding a parameter to optionally allow volatile asms with > > outputs to pass? But then, when *should* that distinction be > > done, to let such volatile asms be allowed to pass as > > not-really-as-volatile-as-we-look-for? I'd think "never" for > > any of the the patched code, or maybe "only when looking at > > effects on memory". > > Yes, weakening volatile_insn_p seems also dangerous to me, and I'd rather lean > towards the opposite, conservative side. > > We apparently have a small conflict between the meaning of volatile asms with > operands at the source level and volatile_insn_p. However, I think that the > latter interpretation is the correct one: if your asm statements have effects > that can be described, then you should use output constraints instead of > volatile; otherwise, you should use volatile and the output constraints have > essentially no meaning. > > The gcc.dg/guality/pr36728-[12].c testcases use volatile asms in an artificial > way to coax the compiler into following a certain behaviour, so I don't think > that they should be taken into account to judge the correctness of the change. > Therefore, I'd go ahead and apply the full patch below, possibly adjusting > both testcases to cope with it. Tentative (and untested) patch attached. > > I've also CCed Jakub though, as he might have a different opinion. > > > gcc: > > PR middle-end/55030 > > * builtins.c (expand_builtin_setjmp_receiver): Update comment > > regarding purpose of blockage. > > * emit-rtl.c [!HAVE_blockage] (gen_blockage): Similarly for > > the head comment. > > * rtlanal.c (volatile_insn_p): Ditto. > > * doc/md.texi (blockage): Update similarly. Change wording to > > require one of two forms, rather than implying a wider choice. > > * cse.c (cse_insn): Where checking for blocking insns, use > > volatile_insn_p instead of manual check for volatile ASM. > > * dse.c (scan_insn): Ditto. > > * cselib.c (cselib_process_insn): Ditto. > > -- > Eric Botcazou ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [RFA:] fix PR55030, wrong code from __builtin_setjmp_receiver 2012-11-27 11:44 ` Hans-Peter Nilsson @ 2012-11-27 12:04 ` Jakub Jelinek 2012-11-27 12:27 ` Hans-Peter Nilsson ` (4 more replies) 0 siblings, 5 replies; 19+ messages in thread From: Jakub Jelinek @ 2012-11-27 12:04 UTC (permalink / raw) To: Hans-Peter Nilsson; +Cc: Eric Botcazou, gcc-patches, Alexandre Oliva On Tue, Nov 27, 2012 at 06:44:23AM -0500, Hans-Peter Nilsson wrote: > > We apparently have a small conflict between the meaning of volatile asms with > > operands at the source level and volatile_insn_p. However, I think that the > > latter interpretation is the correct one: if your asm statements have effects > > that can be described, then you should use output constraints instead of > > volatile; otherwise, you should use volatile and the output constraints have > > essentially no meaning. I strongly disagree with this. Outputs and clobbers have significant meaning even on volatile asms, asm volatile doesn't mean all registers and all memory are supposed to be considered clobbered. For memory you have "memory" clobber for that, for registers it is users responsibility to describe exactly what changes, either in clobbers or in outputs. The difference between non-volatile and volatile asm is, as the documentation states: The `volatile' keyword indicates that the instruction has important side-effects. GCC will not delete a volatile `asm' if it is reachable. Volatile asm acts as an optimization barrier to some extent, but it really can't modify registers or memory that aren't described as modified in the asm pattern. The important side-effects are of some other kind than modifying registers or memory visible from the current function. Ditto for UNSPEC_VOLATILE. So, at least from var-tracking POV which doesn't attempt to perform any optimizations across any insn, just tries to track where values live, IMHO a volatile asm acts exactly the same as non-volatile, that is why I'm testing following patch right now. But the question is also what effects your patch can have e.g. on RTL DSE. 2012-11-26 Jakub Jelinek <jakub@redhat.com> PR debug/36728 PR debug/55467 * cselib.c (cselib_process_insn): If cselib_preserve_constants, don't reset table and exit early on volatile insns and setjmp. Reset table afterwards on setjmp. * gcc.dg/guality/pr36728-1.c: Include "../nop.h", make sure the asm are non-empty and add dependency between the first and second asm. * gcc.dg/guality/pr36728-2.c: Likewise. * gcc.dg/guality/pr36728-3.c: New test. * gcc.dg/guality/pr36728-4.c: New test. --- gcc/cselib.c.jj 2012-11-26 10:14:26.000000000 +0100 +++ gcc/cselib.c 2012-11-26 20:01:10.488304558 +0100 @@ -2626,11 +2626,12 @@ cselib_process_insn (rtx insn) cselib_current_insn = insn; /* Forget everything at a CODE_LABEL, a volatile insn, or a setjmp. */ - if (LABEL_P (insn) - || (CALL_P (insn) - && find_reg_note (insn, REG_SETJMP, NULL)) - || (NONJUMP_INSN_P (insn) - && volatile_insn_p (PATTERN (insn)))) + if ((LABEL_P (insn) + || (CALL_P (insn) + && find_reg_note (insn, REG_SETJMP, NULL)) + || (NONJUMP_INSN_P (insn) + && volatile_insn_p (PATTERN (insn)))) + && !cselib_preserve_constants) { cselib_reset_table (next_uid); cselib_current_insn = NULL_RTX; @@ -2668,9 +2669,18 @@ cselib_process_insn (rtx insn) /* Look for any CLOBBERs in CALL_INSN_FUNCTION_USAGE, but only after we have processed the insn. */ if (CALL_P (insn)) - for (x = CALL_INSN_FUNCTION_USAGE (insn); x; x = XEXP (x, 1)) - if (GET_CODE (XEXP (x, 0)) == CLOBBER) - cselib_invalidate_rtx (XEXP (XEXP (x, 0), 0)); + { + for (x = CALL_INSN_FUNCTION_USAGE (insn); x; x = XEXP (x, 1)) + if (GET_CODE (XEXP (x, 0)) == CLOBBER) + cselib_invalidate_rtx (XEXP (XEXP (x, 0), 0)); + /* Flush evertything on setjmp. */ + if (cselib_preserve_constants + && find_reg_note (insn, REG_SETJMP, NULL)) + { + cselib_preserve_only_values (); + cselib_reset_table (next_uid); + } + } /* On setter of the hard frame pointer if frame_pointer_needed, invalidate stack_pointer_rtx, so that sp and {,h}fp based --- gcc/testsuite/gcc.dg/guality/pr36728-1.c.jj 2012-11-26 10:14:25.000000000 +0100 +++ gcc/testsuite/gcc.dg/guality/pr36728-1.c 2012-11-27 10:01:14.517080157 +0100 @@ -1,7 +1,11 @@ /* PR debug/36728 */ /* { dg-do run } */ /* { dg-options "-g" } */ -int a; + +#include "../nop.h" + +int a, b; + int __attribute__((noinline)) foo (int arg1, int arg2, int arg3, int arg4, int arg5, int arg6, int arg7) { @@ -9,9 +13,9 @@ foo (int arg1, int arg2, int arg3, int a int __attribute__ ((aligned(32))) y; y = 2; - asm ("" : "=m" (y) : "m" (y)); + asm (NOP : "=m" (y), "=m" (b) : "m" (y)); x[0] = 25; - asm ("" : "=m" (x[0]), "=m" (a) : "m" (x[0])); + asm (NOP : "=m" (x[0]), "=m" (a) : "m" (x[0]), "m" (b)); return y; } @@ -21,23 +25,23 @@ foo (int arg1, int arg2, int arg3, int a and arg2. So it is expected that these values are unavailable in some of these tests. */ -/* { dg-final { gdb-test 12 "arg1" "1" { target { ! "s390*-*-*" } } } }*/ -/* { dg-final { gdb-test 12 "arg2" "2" { target { ! "s390*-*-*" } } } }*/ -/* { dg-final { gdb-test 12 "arg3" "3" } } */ -/* { dg-final { gdb-test 12 "arg4" "4" } } */ -/* { dg-final { gdb-test 12 "arg5" "5" } } */ -/* { dg-final { gdb-test 12 "arg6" "6" } } */ -/* { dg-final { gdb-test 12 "arg7" "30" } } */ -/* { dg-final { gdb-test 12 "y" "2" } } */ -/* { dg-final { gdb-test 14 "arg1" "1" { target { ! "s390*-*-*" } } } }*/ -/* { dg-final { gdb-test 14 "arg2" "2" { target { ! "s390*-*-*" } } } }*/ -/* { dg-final { gdb-test 14 "arg3" "3" } } */ -/* { dg-final { gdb-test 14 "arg4" "4" } } */ -/* { dg-final { gdb-test 14 "arg5" "5" } } */ -/* { dg-final { gdb-test 14 "arg6" "6" } } */ -/* { dg-final { gdb-test 14 "arg7" "30" } } */ -/* { dg-final { gdb-test 14 "*x" "(char) 25" } } */ -/* { dg-final { gdb-test 14 "y" "2" } } */ +/* { dg-final { gdb-test 16 "arg1" "1" { target { ! "s390*-*-*" } } } } */ +/* { dg-final { gdb-test 16 "arg2" "2" { target { ! "s390*-*-*" } } } } */ +/* { dg-final { gdb-test 16 "arg3" "3" } } */ +/* { dg-final { gdb-test 16 "arg4" "4" } } */ +/* { dg-final { gdb-test 16 "arg5" "5" } } */ +/* { dg-final { gdb-test 16 "arg6" "6" } } */ +/* { dg-final { gdb-test 16 "arg7" "30" } } */ +/* { dg-final { gdb-test 16 "y" "2" } } */ +/* { dg-final { gdb-test 18 "arg1" "1" { target { ! "s390*-*-*" } } } } */ +/* { dg-final { gdb-test 18 "arg2" "2" { target { ! "s390*-*-*" } } } } */ +/* { dg-final { gdb-test 18 "arg3" "3" } } */ +/* { dg-final { gdb-test 18 "arg4" "4" } } */ +/* { dg-final { gdb-test 18 "arg5" "5" } } */ +/* { dg-final { gdb-test 18 "arg6" "6" } } */ +/* { dg-final { gdb-test 18 "arg7" "30" } } */ +/* { dg-final { gdb-test 18 "*x" "(char) 25" } } */ +/* { dg-final { gdb-test 18 "y" "2" } } */ int main () --- gcc/testsuite/gcc.dg/guality/pr36728-2.c.jj 2012-11-26 10:14:25.000000000 +0100 +++ gcc/testsuite/gcc.dg/guality/pr36728-2.c 2012-11-27 10:00:46.237254022 +0100 @@ -1,7 +1,11 @@ /* PR debug/36728 */ /* { dg-do run } */ /* { dg-options "-g" } */ -int a; + +#include "../nop.h" + +int a, b; + int __attribute__((noinline)) foo (int arg1, int arg2, int arg3, int arg4, int arg5, int arg6, int arg7) { @@ -9,9 +13,9 @@ foo (int arg1, int arg2, int arg3, int a int __attribute__ ((aligned(32))) y; y = 2; - asm ("" : "=m" (y) : "m" (y)); + asm (NOP : "=m" (y), "=m" (b) : "m" (y)); x[0] = 25; - asm ("" : "=m" (x[0]), "=m" (a) : "m" (x[0])); + asm (NOP : "=m" (x[0]), "=m" (a) : "m" (x[0]), "m" (b)); return y; } @@ -21,23 +25,23 @@ foo (int arg1, int arg2, int arg3, int a and arg2. So it is expected that these values are unavailable in some of these tests. */ -/* { dg-final { gdb-test 12 "arg1" "1" { target { ! "s390*-*-*" } } } } */ -/* { dg-final { gdb-test 12 "arg2" "2" { target { ! "s390*-*-*" } } } } */ -/* { dg-final { gdb-test 12 "arg3" "3" } } */ -/* { dg-final { gdb-test 12 "arg4" "4" } } */ -/* { dg-final { gdb-test 12 "arg5" "5" } } */ -/* { dg-final { gdb-test 12 "arg6" "6" } } */ -/* { dg-final { gdb-test 12 "arg7" "30" } } */ -/* { dg-final { gdb-test 12 "y" "2" } } */ -/* { dg-final { gdb-test 14 "arg1" "1" { target { ! "s390*-*-*" } } } } */ -/* { dg-final { gdb-test 14 "arg2" "2" { target { ! "s390*-*-*" } } } } */ -/* { dg-final { gdb-test 14 "arg3" "3" } } */ -/* { dg-final { gdb-test 14 "arg4" "4" } } */ -/* { dg-final { gdb-test 14 "arg5" "5" } } */ -/* { dg-final { gdb-test 14 "arg6" "6" } } */ -/* { dg-final { gdb-test 14 "arg7" "30" } } */ -/* { dg-final { gdb-test 14 "*x" "(char) 25" } } */ -/* { dg-final { gdb-test 14 "y" "2" } } */ +/* { dg-final { gdb-test 16 "arg1" "1" { target { ! "s390*-*-*" } } } } */ +/* { dg-final { gdb-test 16 "arg2" "2" { target { ! "s390*-*-*" } } } } */ +/* { dg-final { gdb-test 16 "arg3" "3" } } */ +/* { dg-final { gdb-test 16 "arg4" "4" } } */ +/* { dg-final { gdb-test 16 "arg5" "5" } } */ +/* { dg-final { gdb-test 16 "arg6" "6" } } */ +/* { dg-final { gdb-test 16 "arg7" "30" } } */ +/* { dg-final { gdb-test 16 "y" "2" } } */ +/* { dg-final { gdb-test 18 "arg1" "1" { target { ! "s390*-*-*" } } } } */ +/* { dg-final { gdb-test 18 "arg2" "2" { target { ! "s390*-*-*" } } } } */ +/* { dg-final { gdb-test 18 "arg3" "3" } } */ +/* { dg-final { gdb-test 18 "arg4" "4" } } */ +/* { dg-final { gdb-test 18 "arg5" "5" } } */ +/* { dg-final { gdb-test 18 "arg6" "6" } } */ +/* { dg-final { gdb-test 18 "arg7" "30" } } */ +/* { dg-final { gdb-test 18 "*x" "(char) 25" } } */ +/* { dg-final { gdb-test 18 "y" "2" } } */ int main () --- gcc/testsuite/gcc.dg/guality/pr36728-3.c.jj 2012-11-26 18:58:40.896599886 +0100 +++ gcc/testsuite/gcc.dg/guality/pr36728-3.c 2012-11-27 10:01:45.455892012 +0100 @@ -0,0 +1,51 @@ +/* PR debug/36728 */ +/* { dg-do run } */ +/* { dg-options "-g" } */ + +#include "../nop.h" + +int __attribute__((noinline)) +foo (int arg1, int arg2, int arg3, int arg4, int arg5, int arg6, int arg7) +{ + char *x = __builtin_alloca (arg7); + int __attribute__ ((aligned(32))) y; + + y = 2; + asm (NOP : "=m" (y) : "m" (y)); + x[0] = 25; + asm volatile (NOP : "=m" (x[0]) : "m" (x[0])); + return y; +} + +/* On s390(x) r2 and r3 are (depending on the optimization level) used + when adjusting the addresses in order to meet the alignment + requirements above. They usually hold the function arguments arg1 + and arg2. So it is expected that these values are unavailable in + some of these tests. */ + +/* { dg-final { gdb-test 14 "arg1" "1" { target { ! "s390*-*-*" } } } } */ +/* { dg-final { gdb-test 14 "arg2" "2" { target { ! "s390*-*-*" } } } } */ +/* { dg-final { gdb-test 14 "arg3" "3" } } */ +/* { dg-final { gdb-test 14 "arg4" "4" } } */ +/* { dg-final { gdb-test 14 "arg5" "5" } } */ +/* { dg-final { gdb-test 14 "arg6" "6" } } */ +/* { dg-final { gdb-test 14 "arg7" "30" } } */ +/* { dg-final { gdb-test 14 "y" "2" } } */ +/* { dg-final { gdb-test 16 "arg1" "1" { target { ! "s390*-*-*" } } } } */ +/* { dg-final { gdb-test 16 "arg2" "2" { target { ! "s390*-*-*" } } } } */ +/* { dg-final { gdb-test 16 "arg3" "3" } } */ +/* { dg-final { gdb-test 16 "arg4" "4" } } */ +/* { dg-final { gdb-test 16 "arg5" "5" } } */ +/* { dg-final { gdb-test 16 "arg6" "6" } } */ +/* { dg-final { gdb-test 16 "arg7" "30" } } */ +/* { dg-final { gdb-test 16 "*x" "(char) 25" } } */ +/* { dg-final { gdb-test 16 "y" "2" } } */ + +int +main () +{ + int l = 0; + asm volatile ("" : "=r" (l) : "0" (l)); + foo (l + 1, l + 2, l + 3, l + 4, l + 5, l + 6, l + 30); + return 0; +} --- gcc/testsuite/gcc.dg/guality/pr36728-4.c.jj 2012-11-26 18:58:44.624580656 +0100 +++ gcc/testsuite/gcc.dg/guality/pr36728-4.c 2012-11-26 14:56:12.000000000 +0100 @@ -0,0 +1,51 @@ +/* PR debug/36728 */ +/* { dg-do run } */ +/* { dg-options "-g" } */ + +#include "../nop.h" + +int __attribute__((noinline)) +foo (int arg1, int arg2, int arg3, int arg4, int arg5, int arg6, int arg7) +{ + char x[30]; + int __attribute__ ((aligned(32))) y; + + y = 2; + asm (NOP : "=m" (y) : "m" (y)); + x[0] = 25; + asm volatile (NOP : "=m" (x[0]) : "m" (x[0])); + return y; +} + +/* On s390(x) r2 and r3 are (depending on the optimization level) used + when adjusting the addresses in order to meet the alignment + requirements above. They usually hold the function arguments arg1 + and arg2. So it is expected that these values are unavailable in + some of these tests. */ + +/* { dg-final { gdb-test 14 "arg1" "1" { target { ! "s390*-*-*" } } } } */ +/* { dg-final { gdb-test 14 "arg2" "2" { target { ! "s390*-*-*" } } } } */ +/* { dg-final { gdb-test 14 "arg3" "3" } } */ +/* { dg-final { gdb-test 14 "arg4" "4" } } */ +/* { dg-final { gdb-test 14 "arg5" "5" } } */ +/* { dg-final { gdb-test 14 "arg6" "6" } } */ +/* { dg-final { gdb-test 14 "arg7" "30" } } */ +/* { dg-final { gdb-test 14 "y" "2" } } */ +/* { dg-final { gdb-test 16 "arg1" "1" { target { ! "s390*-*-*" } } } } */ +/* { dg-final { gdb-test 16 "arg2" "2" { target { ! "s390*-*-*" } } } } */ +/* { dg-final { gdb-test 16 "arg3" "3" } } */ +/* { dg-final { gdb-test 16 "arg4" "4" } } */ +/* { dg-final { gdb-test 16 "arg5" "5" } } */ +/* { dg-final { gdb-test 16 "arg6" "6" } } */ +/* { dg-final { gdb-test 16 "arg7" "30" } } */ +/* { dg-final { gdb-test 16 "*x" "(char) 25" } } */ +/* { dg-final { gdb-test 16 "y" "2" } } */ + +int +main () +{ + int l = 0; + asm volatile ("" : "=r" (l) : "0" (l)); + foo (l + 1, l + 2, l + 3, l + 4, l + 5, l + 6, l + 30); + return 0; +} Jakub ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [RFA:] fix PR55030, wrong code from __builtin_setjmp_receiver 2012-11-27 12:04 ` Jakub Jelinek @ 2012-11-27 12:27 ` Hans-Peter Nilsson 2012-11-27 13:32 ` Jakub Jelinek ` (3 subsequent siblings) 4 siblings, 0 replies; 19+ messages in thread From: Hans-Peter Nilsson @ 2012-11-27 12:27 UTC (permalink / raw) To: Jakub Jelinek; +Cc: Eric Botcazou, gcc-patches, Alexandre Oliva On Tue, 27 Nov 2012, Jakub Jelinek wrote: > On Tue, Nov 27, 2012 at 06:44:23AM -0500, Hans-Peter Nilsson wrote: JFTR: No I didn't, Eric wrote the below. But, it made sense to me. :) > > > We apparently have a small conflict between the meaning of volatile asms with > > > operands at the source level and volatile_insn_p. However, I think that the > > > latter interpretation is the correct one: if your asm statements have effects > > > that can be described, then you should use output constraints instead of > > > volatile; otherwise, you should use volatile and the output constraints have > > > essentially no meaning. > > I strongly disagree with this. > [...] As long as volatile asms and UNSPEC_VOLATILE insns (aka. barriers) are handled the same way and consistently throughout gcc, I'm fine. It seems your patch does that, so thanks! > But the question is also what effects your patch can have e.g. on RTL DSE. Looks like the patch caused a bootstrap for s390x. Eagerly awaiting a PR for that, but whoever is interested on that, please try Jakub's patch first... > 2012-11-26 Jakub Jelinek <jakub@redhat.com> > > PR debug/36728 > PR debug/55467 > * cselib.c (cselib_process_insn): If cselib_preserve_constants, > don't reset table and exit early on volatile insns and setjmp. > Reset table afterwards on setjmp. > > * gcc.dg/guality/pr36728-1.c: Include "../nop.h", make sure the asm > are non-empty and add dependency between the first and second asm. > * gcc.dg/guality/pr36728-2.c: Likewise. > * gcc.dg/guality/pr36728-3.c: New test. > * gcc.dg/guality/pr36728-4.c: New test. brgds, H-P ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [RFA:] fix PR55030, wrong code from __builtin_setjmp_receiver 2012-11-27 12:04 ` Jakub Jelinek 2012-11-27 12:27 ` Hans-Peter Nilsson @ 2012-11-27 13:32 ` Jakub Jelinek 2012-11-27 17:39 ` Eric Botcazou ` (2 subsequent siblings) 4 siblings, 0 replies; 19+ messages in thread From: Jakub Jelinek @ 2012-11-27 13:32 UTC (permalink / raw) To: Hans-Peter Nilsson; +Cc: Eric Botcazou, gcc-patches, Alexandre Oliva On Tue, Nov 27, 2012 at 01:03:47PM +0100, Jakub Jelinek wrote: > So, at least from var-tracking POV which doesn't attempt to perform any > optimizations across any insn, just tries to track where values live, IMHO a > volatile asm acts exactly the same as non-volatile, that is why I'm testing > following patch right now. > > 2012-11-26 Jakub Jelinek <jakub@redhat.com> > > PR debug/36728 > PR debug/55467 > * cselib.c (cselib_process_insn): If cselib_preserve_constants, > don't reset table and exit early on volatile insns and setjmp. > Reset table afterwards on setjmp. > > * gcc.dg/guality/pr36728-1.c: Include "../nop.h", make sure the asm > are non-empty and add dependency between the first and second asm. > * gcc.dg/guality/pr36728-2.c: Likewise. > * gcc.dg/guality/pr36728-3.c: New test. > * gcc.dg/guality/pr36728-4.c: New test. Bootstrapped/regtested fine on x86_64-linux and i686-linux (and fixes the pr36728-[34].c failures that appear without the patch, which are of the wrong debug kind). Jakub ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [RFA:] fix PR55030, wrong code from __builtin_setjmp_receiver 2012-11-27 12:04 ` Jakub Jelinek 2012-11-27 12:27 ` Hans-Peter Nilsson 2012-11-27 13:32 ` Jakub Jelinek @ 2012-11-27 17:39 ` Eric Botcazou 2012-11-27 18:20 ` Jakub Jelinek ` (2 more replies) 2012-11-28 13:44 ` Hans-Peter Nilsson 2012-11-28 20:53 ` Richard Henderson 4 siblings, 3 replies; 19+ messages in thread From: Eric Botcazou @ 2012-11-27 17:39 UTC (permalink / raw) To: Jakub Jelinek; +Cc: gcc-patches, Hans-Peter Nilsson, Alexandre Oliva > I strongly disagree with this. Outputs and clobbers have significant > meaning even on volatile asms, asm volatile doesn't mean all registers and > all memory are supposed to be considered clobbered. For memory you have > "memory" clobber for that, for registers it is users responsibility to > describe exactly what changes, either in clobbers or in outputs. > The difference between non-volatile and volatile asm is, as the > documentation states: > > The `volatile' keyword indicates that the instruction has important > side-effects. GCC will not delete a volatile `asm' if it is reachable. > > Volatile asm acts as an optimization barrier to some extent, but it really > can't modify registers or memory that aren't described as modified in the > asm pattern. The important side-effects are of some other kind than > modifying registers or memory visible from the current function. > Ditto for UNSPEC_VOLATILE. Well, the last sentence would essentially void the entire argument I think. It's well established in the RTL middle-end that UNSPEC_VOLATILE can do pretty much everything behind the back of the compiler. Now I agree that the discussion exists for volatile asms. But you have for example in the unmodified cse.c: /* A volatile ASM invalidates everything. */ if (NONJUMP_INSN_P (insn) && GET_CODE (PATTERN (insn)) == ASM_OPERANDS && MEM_VOLATILE_P (PATTERN (insn))) flush_hash_table (); and the comment of volatile_insn_p is rather explicit: /* Nonzero if X contains any volatile instructions. These are instructions which may cause unpredictable machine state instructions, and thus no instructions should be moved or combined across them. This includes only volatile asms and UNSPEC_VOLATILE instructions. */ The problem is that the various RTL passes don't have a consistent view on that so the patch attempts to tidy this up in a conservative manner. I think that a compromise could be to say that volatile asms without outputs are full barriers (like UNSPEC_VOLATILE) but other volatile asms aren't. That's what the unmodified cse.c, cselib.c and dse.c currently implement. But implementing it consistently would mean weakening volatile_insn_p. > So, at least from var-tracking POV which doesn't attempt to perform any > optimizations across any insn, just tries to track where values live, IMHO a > volatile asm acts exactly the same as non-volatile, that is why I'm testing > following patch right now. > > But the question is also what effects your patch can have e.g. on RTL DSE. It will make all volatile asms be treated as volatile asms without outputs. -- Eric Botcazou ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [RFA:] fix PR55030, wrong code from __builtin_setjmp_receiver 2012-11-27 17:39 ` Eric Botcazou @ 2012-11-27 18:20 ` Jakub Jelinek 2012-12-01 0:03 ` Eric Botcazou 2012-11-28 20:10 ` Richard Sandiford 2012-11-28 20:46 ` Richard Henderson 2 siblings, 1 reply; 19+ messages in thread From: Jakub Jelinek @ 2012-11-27 18:20 UTC (permalink / raw) To: Eric Botcazou; +Cc: gcc-patches, Hans-Peter Nilsson, Alexandre Oliva On Tue, Nov 27, 2012 at 06:35:52PM +0100, Eric Botcazou wrote: > Now I agree that the discussion exists for volatile asms. But you have for > example in the unmodified cse.c: > > /* A volatile ASM invalidates everything. */ > if (NONJUMP_INSN_P (insn) > && GET_CODE (PATTERN (insn)) == ASM_OPERANDS > && MEM_VOLATILE_P (PATTERN (insn))) > flush_hash_table (); That was weird indeed, because asm volatile ("" : : "r" (x)); flushed (for targets that don't add any implicit clobbers) but e.g. asm volatile ("" : : "r" (x) : "r23"); wouldn't (then the pattern is a parallel with ASM_OPERANDS and some CLOBBERs). The effect of that is that it never triggered on i?86/x86_64, because those have the implicit fprs/flags clobbers. I was mainly arguing with the sentence that for asm volatile, the output constraints (did you mean outputs and clobbers?) have essentially no meaning. While for some optimizations perhaps it might be desirable to treat asm volatile as full optimization barrier, I'm not sure about all of them (i.e. it would be important to look for performance degradations caused by that change), and for var-tracking I'd argue that asm vs. asm volatile is completely unimportant, if the asm volatile pattern (or even UNSPEC_VOLATILE) doesn't say it clobbers or sets hard register XYZ, then it can't change it (well, it could but is responsible of restoring it), similarly for memory, if it doesn't clobber "memory" and doesn't set some memory, it can't do that. So, do you object even to the var-tracking change (which would mean if var-tracking found that say some variable's value lives in hard register 12, when encountering asm volatile it would mean to forget about that even when hard register 12 isn't clobbered by the asm, nor set)? For now var-tracking seems to be the most important issue, as we generate invalid debug info there (latent before, but never hit on inline asm on x86_64/i686 except on setjmp calls). As for other passes, the r193802 change e.g. regresses slightly modified pr44194-1.c testcase on x86_64, struct ints { int a, b, c; } foo(); void bar(int a, int b); void func() { struct ints s = foo(); int x; asm volatile ("" : "=r" (x)); bar(s.a, s.b); } func: .LFB0: - xorl %eax, %eax subq $24, %rsp .LCFI0: + xorl %eax, %eax call foo - movq %rax, %rcx - shrq $32, %rcx - movq %rcx, %rsi - movl %eax, %edi + movq %rax, (%rsp) + movl %edx, 8(%rsp) + movl 4(%rsp), %esi + movl (%rsp), %edi addq $24, %rsp .LCFI1: jmp bar To me it looks like an unnecessary pessimization, the volatile there I understand that just it doesn't want to be moved around (e.g. scheduled before the foo call or after bar), that it can't be DCEd, but as it doesn't mention it modifies memory, I don't understand why it should force some registers to stack and back when it has no way to know the compiler would be emitting anything like that at all. Compare that to asm volatile ("" : "=r" (x) : : "memory"); in the testcase, where the r193802 commit makes no difference, the stores/loads are done in both cases. For CSE, I'd agree it should treat asm volatile as barriers, so for cselib.c we probably need some additional cselib_init flag how we want to treat volatile asms... For UNSPEC_VOLATILE, perhaps even DSE should stay conservative, but for var-tracking.c I still don't see a reason for that. Jakub ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [RFA:] fix PR55030, wrong code from __builtin_setjmp_receiver 2012-11-27 18:20 ` Jakub Jelinek @ 2012-12-01 0:03 ` Eric Botcazou 0 siblings, 0 replies; 19+ messages in thread From: Eric Botcazou @ 2012-12-01 0:03 UTC (permalink / raw) To: Jakub Jelinek; +Cc: gcc-patches, Hans-Peter Nilsson, Alexandre Oliva > I was mainly arguing with the sentence that for asm volatile, the output > constraints (did you mean outputs and clobbers?) have essentially no > meaning. While for some optimizations perhaps it might be desirable > to treat asm volatile as full optimization barrier, I'm not sure about all > of them (i.e. it would be important to look for performance degradations > caused by that change), and for var-tracking I'd argue that asm vs. asm > volatile is completely unimportant, if the asm volatile pattern (or even > UNSPEC_VOLATILE) doesn't say it clobbers or sets hard register XYZ, then it > can't change it (well, it could but is responsible of restoring it), > similarly for memory, if it doesn't clobber "memory" and doesn't set some > memory, it can't do that. Yes, I meant outputs and clobbers. The origin of all this is: `blockage' This pattern defines a pseudo insn that prevents the instruction scheduler and other passes from moving instructions and using register equivalences across the boundary defined by the blockage insn. This needs to be an UNSPEC_VOLATILE pattern or a volatile ASM. Now the typical blockage insn is that of the x86: ;; UNSPEC_VOLATILE is considered to use and clobber all hard registers and ;; all of memory. This blocks insns from being moved across this point. (define_insn "blockage" [(unspec_volatile [(const_int 0)] UNSPECV_BLOCKAGE)] "" "" [(set_attr "length" "0")]) [Note the pretty adamant comment about UNSPEC_VOLATILE here...] So, if UNSPEC_VOLATILE is not treated specially, the blockage insn of most ports won't block hard register equivalences. And blockages are precisely used to do that, see e.g. the untyped_call pattern of the x86: /* The optimizer does not know that the call sets the function value registers we stored in the result block. We avoid problems by claiming that all hard registers are used and clobbered at this point. */ emit_insn (gen_blockage ()); That being said, I agree that volatile asms are a separate discussion. > So, do you object even to the var-tracking change (which would mean if > var-tracking found that say some variable's value lives in hard register 12, > when encountering asm volatile it would mean to forget about that even when > hard register 12 isn't clobbered by the asm, nor set)? For now > var-tracking seems to be the most important issue, as we generate invalid > debug info there (latent before, but never hit on inline asm on x86_64/i686 > except on setjmp calls). For volatile asms, no, thanks for fixing the fallout on this side. > As for other passes, the r193802 change e.g. regresses slightly modified > pr44194-1.c testcase on x86_64, > struct ints { int a, b, c; } foo(); > void bar(int a, int b); > > void func() { > struct ints s = foo(); > int x; > asm volatile ("" : "=r" (x)); > bar(s.a, s.b); > } > > func: > .LFB0: > - xorl %eax, %eax > subq $24, %rsp > .LCFI0: > + xorl %eax, %eax > call foo > - movq %rax, %rcx > - shrq $32, %rcx > - movq %rcx, %rsi > - movl %eax, %edi > + movq %rax, (%rsp) > + movl %edx, 8(%rsp) > + movl 4(%rsp), %esi > + movl (%rsp), %edi > addq $24, %rsp > .LCFI1: > jmp bar > > To me it looks like an unnecessary pessimization, the volatile there > I understand that just it doesn't want to be moved around (e.g. scheduled > before the foo call or after bar), that it can't be DCEd, but as it doesn't > mention it modifies memory, I don't understand why it should force some > registers to stack and back when it has no way to know the compiler would be > emitting anything like that at all. > Compare that to > asm volatile ("" : "=r" (x) : : "memory"); > in the testcase, where the r193802 commit makes no difference, the > stores/loads are done in both cases. OK, I agree that the fallout for DSE, and the effect on memory in general, is undesirable, especially for volatile asms. > For CSE, I'd agree it should treat asm volatile as barriers, so for cselib.c > we probably need some additional cselib_init flag how we want to treat > volatile asms... For UNSPEC_VOLATILE, perhaps even DSE should stay > conservative, but for var-tracking.c I still don't see a reason for that. Thank you (as well as the others) for the detailed feedback. Clearly my initial stance on this was a bit extreme and needs to be watered down. I'll ponder about this over the week-end and propose something next week. -- Eric Botcazou ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [RFA:] fix PR55030, wrong code from __builtin_setjmp_receiver 2012-11-27 17:39 ` Eric Botcazou 2012-11-27 18:20 ` Jakub Jelinek @ 2012-11-28 20:10 ` Richard Sandiford 2012-11-28 20:46 ` Richard Henderson 2 siblings, 0 replies; 19+ messages in thread From: Richard Sandiford @ 2012-11-28 20:10 UTC (permalink / raw) To: Eric Botcazou Cc: Jakub Jelinek, gcc-patches, Hans-Peter Nilsson, Alexandre Oliva Eric Botcazou <ebotcazou@adacore.com> writes: >> I strongly disagree with this. Outputs and clobbers have significant >> meaning even on volatile asms, asm volatile doesn't mean all registers and >> all memory are supposed to be considered clobbered. For memory you have >> "memory" clobber for that, for registers it is users responsibility to >> describe exactly what changes, either in clobbers or in outputs. >> The difference between non-volatile and volatile asm is, as the >> documentation states: >> >> The `volatile' keyword indicates that the instruction has important >> side-effects. GCC will not delete a volatile `asm' if it is reachable. >> >> Volatile asm acts as an optimization barrier to some extent, but it really >> can't modify registers or memory that aren't described as modified in the >> asm pattern. The important side-effects are of some other kind than >> modifying registers or memory visible from the current function. >> Ditto for UNSPEC_VOLATILE. > > Well, the last sentence would essentially void the entire argument I > think. It's well established in the RTL middle-end that > UNSPEC_VOLATILE can do pretty much everything behind the back of the > compiler. As always when jumping in the middle of thread, I might well be missing the point sorry, but this sounded a bit extreme if taken literally. An UNSPEC_VOLATILE doesn't in itself force the function to save all call-saved registers, so an UNSPEC_VOLATILE that modifies those registers behind the back of the compiler would lead to us generating wrong code. And I thought UNSPEC_VOLATILEs that also clobber visible memory in an unpredictable way had to have something like (clobber (mem:BLK (scratch))). I thought Jakub's "the important side-effects are of some other kind than modifying registers or memory visible from the current function" applied to UNSPEC_VOLATILE too. Richard ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [RFA:] fix PR55030, wrong code from __builtin_setjmp_receiver 2012-11-27 17:39 ` Eric Botcazou 2012-11-27 18:20 ` Jakub Jelinek 2012-11-28 20:10 ` Richard Sandiford @ 2012-11-28 20:46 ` Richard Henderson 2 siblings, 0 replies; 19+ messages in thread From: Richard Henderson @ 2012-11-28 20:46 UTC (permalink / raw) To: Eric Botcazou Cc: Jakub Jelinek, gcc-patches, Hans-Peter Nilsson, Alexandre Oliva On 11/27/2012 09:35 AM, Eric Botcazou wrote: > It's well established in the RTL middle-end that UNSPEC_VOLATILE can do pretty > much everything behind the back of the compiler. This is false. U_V is a scheduling barrier, and is never deleted as dead (as opposed to unreachable) code. Certainly we don't do anything so complete as invalidate all registers, nor do we invalidate memory. r~ ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [RFA:] fix PR55030, wrong code from __builtin_setjmp_receiver 2012-11-27 12:04 ` Jakub Jelinek ` (2 preceding siblings ...) 2012-11-27 17:39 ` Eric Botcazou @ 2012-11-28 13:44 ` Hans-Peter Nilsson 2012-11-28 13:56 ` Jakub Jelinek 2012-11-28 20:53 ` Richard Henderson 4 siblings, 1 reply; 19+ messages in thread From: Hans-Peter Nilsson @ 2012-11-28 13:44 UTC (permalink / raw) To: Jakub Jelinek; +Cc: Eric Botcazou, gcc-patches, Alexandre Oliva On Tue, 27 Nov 2012, Jakub Jelinek wrote: > 2012-11-26 Jakub Jelinek <jakub@redhat.com> > > PR debug/36728 > PR debug/55467 > * cselib.c (cselib_process_insn): If cselib_preserve_constants, > don't reset table and exit early on volatile insns and setjmp. > Reset table afterwards on setjmp. It seems this also fixes the s390x-linux bootstrap; at least the test-case in PR bootstrap/55511. Thanks again. (N.B. there may also be a bug in var-tracking, covered up by the patch above.) brgds, H-P ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [RFA:] fix PR55030, wrong code from __builtin_setjmp_receiver 2012-11-28 13:44 ` Hans-Peter Nilsson @ 2012-11-28 13:56 ` Jakub Jelinek 0 siblings, 0 replies; 19+ messages in thread From: Jakub Jelinek @ 2012-11-28 13:56 UTC (permalink / raw) To: Hans-Peter Nilsson; +Cc: Eric Botcazou, gcc-patches, Alexandre Oliva On Wed, Nov 28, 2012 at 08:44:18AM -0500, Hans-Peter Nilsson wrote: > On Tue, 27 Nov 2012, Jakub Jelinek wrote: > > 2012-11-26 Jakub Jelinek <jakub@redhat.com> > > > > PR debug/36728 > > PR debug/55467 > > * cselib.c (cselib_process_insn): If cselib_preserve_constants, > > don't reset table and exit early on volatile insns and setjmp. > > Reset table afterwards on setjmp. > > It seems this also fixes the s390x-linux bootstrap; at least the > test-case in PR bootstrap/55511. Thanks again. > > (N.B. there may also be a bug in var-tracking, covered up by the > patch above.) The patch is actually a fix for that. The thing is that both cselib was doing the wrong thing for the resets (not calling cselib_preserve_only_constants () before cselib_reset_table resp. cselib_reset_table not prepared to the thing that it would need to flush all regs/mems, not just from the VALUEs that are kept in the hash table, but also from those that are dropped), and also not adding some special magic microoperation for the volatile insns (unclear what exactly it would have to do). By never handling the volatile insns specially during var-tracking all those issues are gone. Jakub ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [RFA:] fix PR55030, wrong code from __builtin_setjmp_receiver 2012-11-27 12:04 ` Jakub Jelinek ` (3 preceding siblings ...) 2012-11-28 13:44 ` Hans-Peter Nilsson @ 2012-11-28 20:53 ` Richard Henderson 4 siblings, 0 replies; 19+ messages in thread From: Richard Henderson @ 2012-11-28 20:53 UTC (permalink / raw) To: Jakub Jelinek Cc: Hans-Peter Nilsson, Eric Botcazou, gcc-patches, Alexandre Oliva On 11/27/2012 04:03 AM, Jakub Jelinek wrote: > 2012-11-26 Jakub Jelinek <jakub@redhat.com> > > PR debug/36728 > PR debug/55467 > * cselib.c (cselib_process_insn): If cselib_preserve_constants, > don't reset table and exit early on volatile insns and setjmp. > Reset table afterwards on setjmp. > > * gcc.dg/guality/pr36728-1.c: Include "../nop.h", make sure the asm > are non-empty and add dependency between the first and second asm. > * gcc.dg/guality/pr36728-2.c: Likewise. > * gcc.dg/guality/pr36728-3.c: New test. > * gcc.dg/guality/pr36728-4.c: New test. Ok. It appears that PRs 55507 and 55511 are also fixed. r~ ^ permalink raw reply [flat|nested] 19+ messages in thread
end of thread, other threads:[~2012-12-01 0:03 UTC | newest] Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2012-11-27 19:30 [RFA:] fix PR55030, wrong code from __builtin_setjmp_receiver Uros Bizjak 2012-11-28 9:50 ` Uros Bizjak -- strict thread matches above, loose matches on Subject: below -- 2012-11-12 4:49 Hans-Peter Nilsson 2012-11-12 8:28 ` Eric Botcazou 2012-11-19 3:54 ` Hans-Peter Nilsson 2012-11-19 9:35 ` Eric Botcazou 2012-11-26 3:30 ` Hans-Peter Nilsson 2012-11-27 11:44 ` Hans-Peter Nilsson 2012-11-27 12:04 ` Jakub Jelinek 2012-11-27 12:27 ` Hans-Peter Nilsson 2012-11-27 13:32 ` Jakub Jelinek 2012-11-27 17:39 ` Eric Botcazou 2012-11-27 18:20 ` Jakub Jelinek 2012-12-01 0:03 ` Eric Botcazou 2012-11-28 20:10 ` Richard Sandiford 2012-11-28 20:46 ` Richard Henderson 2012-11-28 13:44 ` Hans-Peter Nilsson 2012-11-28 13:56 ` Jakub Jelinek 2012-11-28 20:53 ` Richard Henderson
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).