* Re: [PING][PATCH 2/3] retire mem_signal_fence pattern
@ 2017-09-04 19:02 Uros Bizjak
2017-09-05 10:29 ` Alexander Monakov
0 siblings, 1 reply; 8+ messages in thread
From: Uros Bizjak @ 2017-09-04 19:02 UTC (permalink / raw)
To: gcc-patches; +Cc: Jeff Law, Alexander Monakov, H. J. Lu
> On 09/01/2017 12:26 AM, Alexander Monakov wrote:
>> On Thu, 31 Aug 2017, Jeff Law wrote:
>>> This is OK.
>>>
>>> What's the point of the delete_insns_since calls in patch #3?
>>
>> Deleting the first barrier when maybe_expand_insn failed.
>> Other functions in the file use a similar approach.
> Thanks for clarifying. OK for the trunk. Sorry about the wait.
It looks that:
2017-09-04 Alexander Monakov <amonakov@ispras.ru>
PR rtl-optimization/57448
PR target/67458
PR target/81316
* optabs.c (expand_atomic_load): Place compiler memory barriers if
using atomic_load pattern.
(expand_atomic_store): Likewise.
introduced a couple of regressions on x86 (-m32, 32bit) testsuite:
New failures:
FAIL: gcc.target/i386/pr71245-1.c scan-assembler-not (fistp|fild)
FAIL: gcc.target/i386/pr71245-2.c scan-assembler-not movlps
Uros.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PING][PATCH 2/3] retire mem_signal_fence pattern 2017-09-04 19:02 [PING][PATCH 2/3] retire mem_signal_fence pattern Uros Bizjak @ 2017-09-05 10:29 ` Alexander Monakov 2017-09-05 10:35 ` Uros Bizjak 2017-09-05 11:40 ` Uros Bizjak 0 siblings, 2 replies; 8+ messages in thread From: Alexander Monakov @ 2017-09-05 10:29 UTC (permalink / raw) To: Uros Bizjak; +Cc: gcc-patches, Jeff Law, H. J. Lu On Mon, 4 Sep 2017, Uros Bizjak wrote: > introduced a couple of regressions on x86 (-m32, 32bit) testsuite: > > New failures: > FAIL: gcc.target/i386/pr71245-1.c scan-assembler-not (fistp|fild) > FAIL: gcc.target/i386/pr71245-2.c scan-assembler-not movlps Sorry. I suggest that the tests be XFAIL'ed, the peepholes introduced in the fix for PR 71245 removed, and the PR reopened (it's a missed-optimization PR). I can do all of the above if you agree. I think RTL peepholes are a poor way of fixing the original problem, which actually exists on all targets with separate int/fp registers. For instance, trunk (without my patch) still gets a far simpler testcase wrong (-O2, 64-bit): float f(_Atomic float *p) { return *p; } f: movl (%rdi), %eax movl %eax, -4(%rsp) movss -4(%rsp), %xmm0 ret I believe adding more peepholes to handle this, independently in each affected backend, is hardly appropriate. The issue is caused by translation of floating-point atomic loads/stores to a sequence of integer atomic and a VIEW_CONVERT_EXPR on GIMPLE, which in turn causes the load to be emitted (wrongly) in an integer mode on RTL. I don't know if GIMPLE atomic builtins can work with floating operands (which IMHO would be a preferable way), but if not, then special-casing a sequence of integer atomic mem op and a VCE at expand time might be a cleaner way of achieving the desired optimization. Thanks. Alexander ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PING][PATCH 2/3] retire mem_signal_fence pattern 2017-09-05 10:29 ` Alexander Monakov @ 2017-09-05 10:35 ` Uros Bizjak 2017-09-05 11:40 ` Uros Bizjak 1 sibling, 0 replies; 8+ messages in thread From: Uros Bizjak @ 2017-09-05 10:35 UTC (permalink / raw) To: Alexander Monakov; +Cc: gcc-patches, Jeff Law, H. J. Lu On Tue, Sep 5, 2017 at 12:28 PM, Alexander Monakov <amonakov@ispras.ru> wrote: > On Mon, 4 Sep 2017, Uros Bizjak wrote: >> introduced a couple of regressions on x86 (-m32, 32bit) testsuite: >> >> New failures: >> FAIL: gcc.target/i386/pr71245-1.c scan-assembler-not (fistp|fild) >> FAIL: gcc.target/i386/pr71245-2.c scan-assembler-not movlps > > Sorry. I suggest that the tests be XFAIL'ed, the peepholes introduced in the > fix for PR 71245 removed, and the PR reopened (it's a missed-optimization PR). > I can do all of the above if you agree. No, I have a solution for the regression. A prerequisite is patch at [1]. [1] https://gcc.gnu.org/ml/gcc-patches/2017-09/msg00239.html Uros. ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PING][PATCH 2/3] retire mem_signal_fence pattern 2017-09-05 10:29 ` Alexander Monakov 2017-09-05 10:35 ` Uros Bizjak @ 2017-09-05 11:40 ` Uros Bizjak 2017-09-05 15:47 ` Christophe Lyon 1 sibling, 1 reply; 8+ messages in thread From: Uros Bizjak @ 2017-09-05 11:40 UTC (permalink / raw) To: Alexander Monakov; +Cc: gcc-patches, Jeff Law, H. J. Lu [-- Attachment #1: Type: text/plain, Size: 1068 bytes --] On Tue, Sep 5, 2017 at 12:28 PM, Alexander Monakov <amonakov@ispras.ru> wrote: > On Mon, 4 Sep 2017, Uros Bizjak wrote: >> introduced a couple of regressions on x86 (-m32, 32bit) testsuite: >> >> New failures: >> FAIL: gcc.target/i386/pr71245-1.c scan-assembler-not (fistp|fild) >> FAIL: gcc.target/i386/pr71245-2.c scan-assembler-not movlps > > Sorry. I suggest that the tests be XFAIL'ed, the peepholes introduced in the > fix for PR 71245 removed, and the PR reopened (it's a missed-optimization PR). > I can do all of the above if you agree. > > I think RTL peepholes are a poor way of fixing the original problem, which > actually exists on all targets with separate int/fp registers. For instance, > trunk (without my patch) still gets a far simpler testcase wrong (-O2, 64-bit): Please note that 32bit x86 implements atomic DImode access with fild/fistp combination, so the mentioned peephole avoids quite costly instruction sequence. For reference, attached patch implements additional peephole2 patterns that also handle sequences with blockages. Uros. [-- Attachment #2: q.diff.txt --] [-- Type: text/plain, Size: 5983 bytes --] diff --git a/gcc/config/i386/sync.md b/gcc/config/i386/sync.md index 29b82f86d43a..eceaa73a6799 100644 --- a/gcc/config/i386/sync.md +++ b/gcc/config/i386/sync.md @@ -219,29 +219,71 @@ (set (match_operand:DI 2 "memory_operand") (unspec:DI [(match_dup 0)] UNSPEC_FIST_ATOMIC)) - (set (match_operand:DF 3 "fp_register_operand") + (set (match_operand:DF 3 "any_fp_register_operand") (match_operand:DF 4 "memory_operand"))] "!TARGET_64BIT && peep2_reg_dead_p (2, operands[0]) - && rtx_equal_p (operands[4], adjust_address_nv (operands[2], DFmode, 0))" + && rtx_equal_p (XEXP (operands[4], 0), XEXP (operands[2], 0))" [(set (match_dup 3) (match_dup 5))] "operands[5] = gen_lowpart (DFmode, operands[1]);") (define_peephole2 + [(set (match_operand:DF 0 "fp_register_operand") + (unspec:DF [(match_operand:DI 1 "memory_operand")] + UNSPEC_FILD_ATOMIC)) + (set (match_operand:DI 2 "memory_operand") + (unspec:DI [(match_dup 0)] + UNSPEC_FIST_ATOMIC)) + (set (mem:BLK (scratch:SI)) + (unspec:BLK [(mem:BLK (scratch:SI))] UNSPEC_MEMORY_BLOCKAGE)) + (set (match_operand:DF 3 "any_fp_register_operand") + (match_operand:DF 4 "memory_operand"))] + "!TARGET_64BIT + && peep2_reg_dead_p (2, operands[0]) + && rtx_equal_p (XEXP (operands[4], 0), XEXP (operands[2], 0))" + [(const_int 0)] +{ + emit_move_insn (operands[3], gen_lowpart (DFmode, operands[1])); + emit_insn (gen_memory_blockage ()); + DONE; +}) + +(define_peephole2 [(set (match_operand:DF 0 "sse_reg_operand") (unspec:DF [(match_operand:DI 1 "memory_operand")] UNSPEC_LDX_ATOMIC)) (set (match_operand:DI 2 "memory_operand") (unspec:DI [(match_dup 0)] UNSPEC_STX_ATOMIC)) - (set (match_operand:DF 3 "fp_register_operand") + (set (match_operand:DF 3 "any_fp_register_operand") (match_operand:DF 4 "memory_operand"))] "!TARGET_64BIT && peep2_reg_dead_p (2, operands[0]) - && rtx_equal_p (operands[4], adjust_address_nv (operands[2], DFmode, 0))" + && rtx_equal_p (XEXP (operands[4], 0), XEXP (operands[2], 0))" [(set (match_dup 3) (match_dup 5))] "operands[5] = gen_lowpart (DFmode, operands[1]);") +(define_peephole2 + [(set (match_operand:DF 0 "sse_reg_operand") + (unspec:DF [(match_operand:DI 1 "memory_operand")] + UNSPEC_LDX_ATOMIC)) + (set (match_operand:DI 2 "memory_operand") + (unspec:DI [(match_dup 0)] + UNSPEC_STX_ATOMIC)) + (set (mem:BLK (scratch:SI)) + (unspec:BLK [(mem:BLK (scratch:SI))] UNSPEC_MEMORY_BLOCKAGE)) + (set (match_operand:DF 3 "any_fp_register_operand") + (match_operand:DF 4 "memory_operand"))] + "!TARGET_64BIT + && peep2_reg_dead_p (2, operands[0]) + && rtx_equal_p (XEXP (operands[4], 0), XEXP (operands[2], 0))" + [(const_int 0)] +{ + emit_move_insn (operands[3], gen_lowpart (DFmode, operands[1])); + emit_insn (gen_memory_blockage ()); + DONE; +}) + (define_expand "atomic_store<mode>" [(set (match_operand:ATOMIC 0 "memory_operand") (unspec:ATOMIC [(match_operand:ATOMIC 1 "nonimmediate_operand") @@ -331,7 +373,7 @@ (define_peephole2 [(set (match_operand:DF 0 "memory_operand") - (match_operand:DF 1 "fp_register_operand")) + (match_operand:DF 1 "any_fp_register_operand")) (set (match_operand:DF 2 "fp_register_operand") (unspec:DF [(match_operand:DI 3 "memory_operand")] UNSPEC_FILD_ATOMIC)) @@ -340,13 +382,34 @@ UNSPEC_FIST_ATOMIC))] "!TARGET_64BIT && peep2_reg_dead_p (3, operands[2]) - && rtx_equal_p (operands[0], adjust_address_nv (operands[3], DFmode, 0))" + && rtx_equal_p (XEXP (operands[0], 0), XEXP (operands[3], 0))" [(set (match_dup 5) (match_dup 1))] "operands[5] = gen_lowpart (DFmode, operands[4]);") (define_peephole2 [(set (match_operand:DF 0 "memory_operand") - (match_operand:DF 1 "fp_register_operand")) + (match_operand:DF 1 "any_fp_register_operand")) + (set (mem:BLK (scratch:SI)) + (unspec:BLK [(mem:BLK (scratch:SI))] UNSPEC_MEMORY_BLOCKAGE)) + (set (match_operand:DF 2 "fp_register_operand") + (unspec:DF [(match_operand:DI 3 "memory_operand")] + UNSPEC_FILD_ATOMIC)) + (set (match_operand:DI 4 "memory_operand") + (unspec:DI [(match_dup 2)] + UNSPEC_FIST_ATOMIC))] + "!TARGET_64BIT + && peep2_reg_dead_p (4, operands[2]) + && rtx_equal_p (XEXP (operands[0], 0), XEXP (operands[3], 0))" + [(const_int 0)] +{ + emit_insn (gen_memory_blockage ()); + emit_move_insn (gen_lowpart (DFmode, operands[4]), operands[1]); + DONE; +}) + +(define_peephole2 + [(set (match_operand:DF 0 "memory_operand") + (match_operand:DF 1 "any_fp_register_operand")) (set (match_operand:DF 2 "sse_reg_operand") (unspec:DF [(match_operand:DI 3 "memory_operand")] UNSPEC_LDX_ATOMIC)) @@ -355,10 +418,31 @@ UNSPEC_STX_ATOMIC))] "!TARGET_64BIT && peep2_reg_dead_p (3, operands[2]) - && rtx_equal_p (operands[0], adjust_address_nv (operands[3], DFmode, 0))" + && rtx_equal_p (XEXP (operands[0], 0), XEXP (operands[3], 0))" [(set (match_dup 5) (match_dup 1))] "operands[5] = gen_lowpart (DFmode, operands[4]);") +(define_peephole2 + [(set (match_operand:DF 0 "memory_operand") + (match_operand:DF 1 "any_fp_register_operand")) + (set (mem:BLK (scratch:SI)) + (unspec:BLK [(mem:BLK (scratch:SI))] UNSPEC_MEMORY_BLOCKAGE)) + (set (match_operand:DF 2 "sse_reg_operand") + (unspec:DF [(match_operand:DI 3 "memory_operand")] + UNSPEC_LDX_ATOMIC)) + (set (match_operand:DI 4 "memory_operand") + (unspec:DI [(match_dup 2)] + UNSPEC_STX_ATOMIC))] + "!TARGET_64BIT + && peep2_reg_dead_p (4, operands[2]) + && rtx_equal_p (XEXP (operands[0], 0), XEXP (operands[3], 0))" + [(const_int 0)] +{ + emit_insn (gen_memory_blockage ()); + emit_move_insn (gen_lowpart (DFmode, operands[4]), operands[1]); + DONE; +}) + ;; ??? You'd think that we'd be able to perform this via FLOAT + FIX_TRUNC ;; operations. But the fix_trunc patterns want way more setup than we want ;; to provide. Note that the scratch is DFmode instead of XFmode in order ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PING][PATCH 2/3] retire mem_signal_fence pattern 2017-09-05 11:40 ` Uros Bizjak @ 2017-09-05 15:47 ` Christophe Lyon 0 siblings, 0 replies; 8+ messages in thread From: Christophe Lyon @ 2017-09-05 15:47 UTC (permalink / raw) To: Uros Bizjak; +Cc: Alexander Monakov, gcc-patches, Jeff Law, H. J. Lu Hi, On 5 September 2017 at 13:40, Uros Bizjak <ubizjak@gmail.com> wrote: > On Tue, Sep 5, 2017 at 12:28 PM, Alexander Monakov <amonakov@ispras.ru> wrote: >> On Mon, 4 Sep 2017, Uros Bizjak wrote: >>> introduced a couple of regressions on x86 (-m32, 32bit) testsuite: >>> >>> New failures: >>> FAIL: gcc.target/i386/pr71245-1.c scan-assembler-not (fistp|fild) >>> FAIL: gcc.target/i386/pr71245-2.c scan-assembler-not movlps >> >> Sorry. I suggest that the tests be XFAIL'ed, the peepholes introduced in the >> fix for PR 71245 removed, and the PR reopened (it's a missed-optimization PR). >> I can do all of the above if you agree. >> >> I think RTL peepholes are a poor way of fixing the original problem, which >> actually exists on all targets with separate int/fp registers. For instance, >> trunk (without my patch) still gets a far simpler testcase wrong (-O2, 64-bit): > > Please note that 32bit x86 implements atomic DImode access with > fild/fistp combination, so the mentioned peephole avoids quite costly > instruction sequence. > > For reference, attached patch implements additional peephole2 patterns > that also handle sequences with blockages. > > Uros. On arm, we also have a similar regression: FAIL: gcc.target/arm/stl-cond.c scan-assembler stlne Before the patch, we generated: ldr r3, [r0] cmp r3, #0 addne r3, r0, #4 movne r2, #0 stlne r2, [r3] bx lr and now: ldr r3, [r0] cmp r3, #0 bxeq lr mov r2, #0 add r3, r0, #4 stl r2, [r3] bx lr Christophe ^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH 1/3] optabs: ensure mem_thread_fence is a compiler barrier @ 2017-08-02 17:45 Alexander Monakov 2017-08-02 17:45 ` [PATCH 2/3] retire mem_signal_fence pattern Alexander Monakov 0 siblings, 1 reply; 8+ messages in thread From: Alexander Monakov @ 2017-08-02 17:45 UTC (permalink / raw) To: gcc-patches As recently discussed in the previous thread for PR 80640, some targets have sufficiently strong hardware memory ordering that implementation of C11 atomic fences might not need machine barriers. However, a compiler memory barrier nevertheless must be present, and at least two targets (x86, s390) got this wrong in their .md expanders. Handle it in the middle-end by detecting empty expansion and emitting a compiler memory barrier. If the backend produced non-empty RTL, expect that it's using the same RTX structure as in "memory_barrier" (__sync_synchronize) expansion, which must be a compiler memory on its own. A followup refactor is possible in expand_mem_thread_fence to make it start with if (is_mm_relaxed (model)) return; as it's not useful to pass __ATOMIC_RELAXED model to gen_mem_thread_fence. PR target/80640 * doc/md.texi (mem_thread_fence): Remove mention of mode. Clarify that empty expansion is handled by placing a compiler barrier. * optabs.c (expand_mem_thread_fence): Place a compiler barrier if targetm.gen_mem_thread_fence produced no instructions. testsuite/ * gcc.dg/atomic/pr80640.c: New testcase. --- gcc/doc/md.texi | 9 +++++++-- gcc/optabs.c | 7 ++++++- gcc/testsuite/gcc.dg/atomic/pr80640.c | 34 ++++++++++++++++++++++++++++++++++ 3 files changed, 47 insertions(+), 3 deletions(-) create mode 100644 gcc/testsuite/gcc.dg/atomic/pr80640.c diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi index ea959208c98..0be161a08dc 100644 --- a/gcc/doc/md.texi +++ b/gcc/doc/md.texi @@ -7044,11 +7044,16 @@ If these patterns are not defined, attempts will be made to use counterparts. If none of these are available a compare-and-swap loop will be used. -@cindex @code{mem_thread_fence@var{mode}} instruction pattern -@item @samp{mem_thread_fence@var{mode}} +@cindex @code{mem_thread_fence} instruction pattern +@item @samp{mem_thread_fence} This pattern emits code required to implement a thread fence with memory model semantics. Operand 0 is the memory model to be used. +Expanding this pattern may produce no instructions if no machine barrier +is required on the target for given memory model. In that case, unless +memory model is @code{__ATOMIC_RELAXED}, a compiler memory barrier is +emitted automatically. + If this pattern is not specified, all memory models except @code{__ATOMIC_RELAXED} will result in issuing a @code{sync_synchronize} barrier pattern. diff --git a/gcc/optabs.c b/gcc/optabs.c index a9900657a58..d568ca376fe 100644 --- a/gcc/optabs.c +++ b/gcc/optabs.c @@ -6298,7 +6298,12 @@ void expand_mem_thread_fence (enum memmodel model) { if (targetm.have_mem_thread_fence ()) - emit_insn (targetm.gen_mem_thread_fence (GEN_INT (model))); + { + rtx_insn *last = get_last_insn (); + emit_insn (targetm.gen_mem_thread_fence (GEN_INT (model))); + if (last == get_last_insn () && !is_mm_relaxed (model)) + expand_asm_memory_barrier (); + } else if (!is_mm_relaxed (model)) { if (targetm.have_memory_barrier ()) diff --git a/gcc/testsuite/gcc.dg/atomic/pr80640.c b/gcc/testsuite/gcc.dg/atomic/pr80640.c new file mode 100644 index 00000000000..fd17978a482 --- /dev/null +++ b/gcc/testsuite/gcc.dg/atomic/pr80640.c @@ -0,0 +1,34 @@ +/* { dg-do run } */ +/* { dg-options "-pthread" } */ +/* { dg-require-effective-target pthread } */ + +#include <pthread.h> + +static volatile int sem1; +static volatile int sem2; + +static void *f(void *va) +{ + void **p = va; + if (*p) return *p; + sem1 = 1; + while (!sem2); + __atomic_thread_fence(__ATOMIC_ACQUIRE); + // GCC used to RTL-CSE this and the first load, causing 0 to be returned + return *p; +} + +int main() +{ + void *p = 0; + pthread_t thr; + if (pthread_create(&thr, 0, f, &p)) + return 2; + while (!sem1); + __atomic_thread_fence(__ATOMIC_ACQUIRE); + p = &p; + __atomic_thread_fence(__ATOMIC_RELEASE); + sem2 = 1; + pthread_join(thr, &p); + return !p; +} -- 2.13.3 ^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH 2/3] retire mem_signal_fence pattern 2017-08-02 17:45 [PATCH 1/3] optabs: ensure mem_thread_fence is a compiler barrier Alexander Monakov @ 2017-08-02 17:45 ` Alexander Monakov 2017-08-28 12:39 ` [PING][PATCH " Alexander Monakov 0 siblings, 1 reply; 8+ messages in thread From: Alexander Monakov @ 2017-08-02 17:45 UTC (permalink / raw) To: gcc-patches Similar to mem_thread_fence issue from the patch 1/3, RTL representation of __atomic_signal_fence must be a compiler barrier. We have just one backend offering this pattern, and it does not place a compiler barrier. It does not appear useful to expand signal_fence to some kind of hardware instruction, its documentation is wrong/outdated, and we are using it incorrectly anyway. So just remove the whole thing and simply emit a compiler memory barrier in the optabs.c handler. * config/s390/s390.md (mem_signal_fence): Remove. * doc/md.texi (mem_signal_fence): Remove. * optabs.c (expand_mem_signal_fence): Remove uses of mem_signal_fence. Update comments. * target-insns.def (mem_signal_fence): Remove. --- gcc/config/s390/s390.md | 9 --------- gcc/doc/md.texi | 13 ------------- gcc/optabs.c | 17 +++++------------ gcc/target-insns.def | 1 - 4 files changed, 5 insertions(+), 35 deletions(-) diff --git a/gcc/config/s390/s390.md b/gcc/config/s390/s390.md index d1ac0b8395d..1d63523d3b0 100644 --- a/gcc/config/s390/s390.md +++ b/gcc/config/s390/s390.md @@ -10084,15 +10084,6 @@ (define_insn "*basr_tls" ; memory barrier patterns. ; -(define_expand "mem_signal_fence" - [(match_operand:SI 0 "const_int_operand")] ;; model - "" -{ - /* The s390 memory model is strong enough not to require any - barrier in order to synchronize a thread with itself. */ - DONE; -}) - (define_expand "mem_thread_fence" [(match_operand:SI 0 "const_int_operand")] ;; model "" diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi index 0be161a08dc..73d0e7d83c0 100644 --- a/gcc/doc/md.texi +++ b/gcc/doc/md.texi @@ -7058,19 +7058,6 @@ If this pattern is not specified, all memory models except @code{__ATOMIC_RELAXED} will result in issuing a @code{sync_synchronize} barrier pattern. -@cindex @code{mem_signal_fence@var{mode}} instruction pattern -@item @samp{mem_signal_fence@var{mode}} -This pattern emits code required to implement a signal fence with -memory model semantics. Operand 0 is the memory model to be used. - -This pattern should impact the compiler optimizers the same way that -mem_signal_fence does, but it does not need to issue any barrier -instructions. - -If this pattern is not specified, all memory models except -@code{__ATOMIC_RELAXED} will result in issuing a @code{sync_synchronize} -barrier pattern. - @cindex @code{get_thread_pointer@var{mode}} instruction pattern @cindex @code{set_thread_pointer@var{mode}} instruction pattern @item @samp{get_thread_pointer@var{mode}} diff --git a/gcc/optabs.c b/gcc/optabs.c index d568ca376fe..6884fd4b8aa 100644 --- a/gcc/optabs.c +++ b/gcc/optabs.c @@ -6315,22 +6315,15 @@ expand_mem_thread_fence (enum memmodel model) } } -/* This routine will either emit the mem_signal_fence pattern or issue a - sync_synchronize to generate a fence for memory model MEMMODEL. */ +/* Emit a signal fence with given memory model. */ void expand_mem_signal_fence (enum memmodel model) { - if (targetm.have_mem_signal_fence ()) - emit_insn (targetm.gen_mem_signal_fence (GEN_INT (model))); - else if (!is_mm_relaxed (model)) - { - /* By default targets are coherent between a thread and the signal - handler running on the same thread. Thus this really becomes a - compiler barrier, in that stores must not be sunk past - (or raised above) a given point. */ - expand_asm_memory_barrier (); - } + /* No machine barrier is required to implement a signal fence, but + a compiler memory barrier must be issued, except for relaxed MM. */ + if (!is_mm_relaxed (model)) + expand_asm_memory_barrier (); } /* This function expands the atomic load operation: diff --git a/gcc/target-insns.def b/gcc/target-insns.def index fb92f72ac29..4669439c7e1 100644 --- a/gcc/target-insns.def +++ b/gcc/target-insns.def @@ -58,7 +58,6 @@ DEF_TARGET_INSN (indirect_jump, (rtx x0)) DEF_TARGET_INSN (insv, (rtx x0, rtx x1, rtx x2, rtx x3)) DEF_TARGET_INSN (jump, (rtx x0)) DEF_TARGET_INSN (load_multiple, (rtx x0, rtx x1, rtx x2)) -DEF_TARGET_INSN (mem_signal_fence, (rtx x0)) DEF_TARGET_INSN (mem_thread_fence, (rtx x0)) DEF_TARGET_INSN (memory_barrier, (void)) DEF_TARGET_INSN (movstr, (rtx x0, rtx x1, rtx x2)) -- 2.13.3 ^ permalink raw reply [flat|nested] 8+ messages in thread
* [PING][PATCH 2/3] retire mem_signal_fence pattern 2017-08-02 17:45 ` [PATCH 2/3] retire mem_signal_fence pattern Alexander Monakov @ 2017-08-28 12:39 ` Alexander Monakov [not found] ` <d738f0d1-e9d1-bfb2-e413-4d4b78370cdb@redhat.com> 0 siblings, 1 reply; 8+ messages in thread From: Alexander Monakov @ 2017-08-28 12:39 UTC (permalink / raw) To: gcc-patches Ping (for this and patch 3/3 in the thread). On Wed, 2 Aug 2017, Alexander Monakov wrote: > Similar to mem_thread_fence issue from the patch 1/3, RTL representation of > __atomic_signal_fence must be a compiler barrier. We have just one backend > offering this pattern, and it does not place a compiler barrier. > > It does not appear useful to expand signal_fence to some kind of hardware > instruction, its documentation is wrong/outdated, and we are using it > incorrectly anyway. So just remove the whole thing and simply emit a compiler > memory barrier in the optabs.c handler. > > * config/s390/s390.md (mem_signal_fence): Remove. > * doc/md.texi (mem_signal_fence): Remove. > * optabs.c (expand_mem_signal_fence): Remove uses of mem_signal_fence. > Update comments. > * target-insns.def (mem_signal_fence): Remove. > > --- > gcc/config/s390/s390.md | 9 --------- > gcc/doc/md.texi | 13 ------------- > gcc/optabs.c | 17 +++++------------ > gcc/target-insns.def | 1 - > 4 files changed, 5 insertions(+), 35 deletions(-) > > diff --git a/gcc/config/s390/s390.md b/gcc/config/s390/s390.md > index d1ac0b8395d..1d63523d3b0 100644 > --- a/gcc/config/s390/s390.md > +++ b/gcc/config/s390/s390.md > @@ -10084,15 +10084,6 @@ (define_insn "*basr_tls" > ; memory barrier patterns. > ; > > -(define_expand "mem_signal_fence" > - [(match_operand:SI 0 "const_int_operand")] ;; model > - "" > -{ > - /* The s390 memory model is strong enough not to require any > - barrier in order to synchronize a thread with itself. */ > - DONE; > -}) > - > (define_expand "mem_thread_fence" > [(match_operand:SI 0 "const_int_operand")] ;; model > "" > diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi > index 0be161a08dc..73d0e7d83c0 100644 > --- a/gcc/doc/md.texi > +++ b/gcc/doc/md.texi > @@ -7058,19 +7058,6 @@ If this pattern is not specified, all memory models except > @code{__ATOMIC_RELAXED} will result in issuing a @code{sync_synchronize} > barrier pattern. > > -@cindex @code{mem_signal_fence@var{mode}} instruction pattern > -@item @samp{mem_signal_fence@var{mode}} > -This pattern emits code required to implement a signal fence with > -memory model semantics. Operand 0 is the memory model to be used. > - > -This pattern should impact the compiler optimizers the same way that > -mem_signal_fence does, but it does not need to issue any barrier > -instructions. > - > -If this pattern is not specified, all memory models except > -@code{__ATOMIC_RELAXED} will result in issuing a @code{sync_synchronize} > -barrier pattern. > - > @cindex @code{get_thread_pointer@var{mode}} instruction pattern > @cindex @code{set_thread_pointer@var{mode}} instruction pattern > @item @samp{get_thread_pointer@var{mode}} > diff --git a/gcc/optabs.c b/gcc/optabs.c > index d568ca376fe..6884fd4b8aa 100644 > --- a/gcc/optabs.c > +++ b/gcc/optabs.c > @@ -6315,22 +6315,15 @@ expand_mem_thread_fence (enum memmodel model) > } > } > > -/* This routine will either emit the mem_signal_fence pattern or issue a > - sync_synchronize to generate a fence for memory model MEMMODEL. */ > +/* Emit a signal fence with given memory model. */ > > void > expand_mem_signal_fence (enum memmodel model) > { > - if (targetm.have_mem_signal_fence ()) > - emit_insn (targetm.gen_mem_signal_fence (GEN_INT (model))); > - else if (!is_mm_relaxed (model)) > - { > - /* By default targets are coherent between a thread and the signal > - handler running on the same thread. Thus this really becomes a > - compiler barrier, in that stores must not be sunk past > - (or raised above) a given point. */ > - expand_asm_memory_barrier (); > - } > + /* No machine barrier is required to implement a signal fence, but > + a compiler memory barrier must be issued, except for relaxed MM. */ > + if (!is_mm_relaxed (model)) > + expand_asm_memory_barrier (); > } > > /* This function expands the atomic load operation: > diff --git a/gcc/target-insns.def b/gcc/target-insns.def > index fb92f72ac29..4669439c7e1 100644 > --- a/gcc/target-insns.def > +++ b/gcc/target-insns.def > @@ -58,7 +58,6 @@ DEF_TARGET_INSN (indirect_jump, (rtx x0)) > DEF_TARGET_INSN (insv, (rtx x0, rtx x1, rtx x2, rtx x3)) > DEF_TARGET_INSN (jump, (rtx x0)) > DEF_TARGET_INSN (load_multiple, (rtx x0, rtx x1, rtx x2)) > -DEF_TARGET_INSN (mem_signal_fence, (rtx x0)) > DEF_TARGET_INSN (mem_thread_fence, (rtx x0)) > DEF_TARGET_INSN (memory_barrier, (void)) > DEF_TARGET_INSN (movstr, (rtx x0, rtx x1, rtx x2)) > ^ permalink raw reply [flat|nested] 8+ messages in thread
[parent not found: <d738f0d1-e9d1-bfb2-e413-4d4b78370cdb@redhat.com>]
* Re: [PING][PATCH 2/3] retire mem_signal_fence pattern [not found] ` <d738f0d1-e9d1-bfb2-e413-4d4b78370cdb@redhat.com> @ 2017-09-01 6:26 ` Alexander Monakov 2017-09-04 4:56 ` Jeff Law 0 siblings, 1 reply; 8+ messages in thread From: Alexander Monakov @ 2017-09-01 6:26 UTC (permalink / raw) To: Jeff Law; +Cc: gcc-patches On Thu, 31 Aug 2017, Jeff Law wrote: > This is OK. > > What's the point of the delete_insns_since calls in patch #3? Deleting the first barrier when maybe_expand_insn failed. Other functions in the file use a similar approach. Thanks. Alexander ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PING][PATCH 2/3] retire mem_signal_fence pattern 2017-09-01 6:26 ` Alexander Monakov @ 2017-09-04 4:56 ` Jeff Law 0 siblings, 0 replies; 8+ messages in thread From: Jeff Law @ 2017-09-04 4:56 UTC (permalink / raw) To: Alexander Monakov; +Cc: gcc-patches On 09/01/2017 12:26 AM, Alexander Monakov wrote: > On Thu, 31 Aug 2017, Jeff Law wrote: >> This is OK. >> >> What's the point of the delete_insns_since calls in patch #3? > > Deleting the first barrier when maybe_expand_insn failed. > Other functions in the file use a similar approach. Thanks for clarifying. OK for the trunk. Sorry about the wait. Jeff ^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2017-09-05 15:47 UTC | newest] Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2017-09-04 19:02 [PING][PATCH 2/3] retire mem_signal_fence pattern Uros Bizjak 2017-09-05 10:29 ` Alexander Monakov 2017-09-05 10:35 ` Uros Bizjak 2017-09-05 11:40 ` Uros Bizjak 2017-09-05 15:47 ` Christophe Lyon -- strict thread matches above, loose matches on Subject: below -- 2017-08-02 17:45 [PATCH 1/3] optabs: ensure mem_thread_fence is a compiler barrier Alexander Monakov 2017-08-02 17:45 ` [PATCH 2/3] retire mem_signal_fence pattern Alexander Monakov 2017-08-28 12:39 ` [PING][PATCH " Alexander Monakov [not found] ` <d738f0d1-e9d1-bfb2-e413-4d4b78370cdb@redhat.com> 2017-09-01 6:26 ` Alexander Monakov 2017-09-04 4:56 ` Jeff Law
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).