Date: Mon, 31 Jul 2017 18:01:00 -0000
From: Alexander Monakov
To: Jeff Law
cc: gcc-patches@gcc.gnu.org, Richard Henderson, Uros Bizjak
Subject: Re: [PATCH 1/2] x86,s390: add compiler memory barriers when expanding atomic_thread_fence (PR 80640)

On Mon, 31 Jul 2017, Jeff Law wrote:
> > Please consider that expand_mem_thread_fence is used to place fences around
> > seq-cst atomic loads&stores when the backend doesn't provide a direct pattern.
> > With compiler barriers on both sides of the machine barrier, the generated
> > sequence for a seq-cst atomic load will be 7 insns:
> >
> >   asm volatile ("":::"memory");
> >   machine_seq_cst_fence ();
> >   asm volatile ("":::"memory");
> >   dst = mem[src];
> >   asm volatile ("":::"memory");
> >   machine_seq_cst_fence ();
> >   asm volatile ("":::"memory");
> >
> > I can easily imagine people looking at RTL dumps with this overkill fencing
> > being unhappy about this.
> But the extra fences aren't actually going to impact anything except
> perhaps an unmeasurable compile-time hit. ie, it may look bad, but I'd
> have a hard time believing it matters in practice.

I agree it doesn't matter that much from a compile-time/memory standpoint.
I meant that it matters from the standpoint of a person working with the RTL
dumps, i.e. having to read through all of that, especially if they need to
work with non-slim dumps.

> > I'd be more happy with detecting empty expansion via get_last_insn ().
> That seems like an unnecessary complication to me.

I think it's quite simple by GCC standards:

  {
    /* Remember where the insn stream ended before the target pattern ran.  */
    rtx_insn *last = get_last_insn ();
    emit_insn (targetm.gen_mem_thread_fence (GEN_INT (model)));
    /* Nothing was emitted: fall back to a compiler-only barrier.  */
    if (last == get_last_insn () && !is_mm_relaxed (model))
      expand_asm_memory_barrier ();
  }

> I'd prefer instead to just emit the necessary fencing in the generic code
> and update the MD docs so that maintainers know they don't have to emit
> the RTL fences themselves.

I agree the docs need an update, but hopefully not in this way. The legacy
__sync_synchronize barrier has always been required to be a compiler barrier
when expanded to RTL, and it's quite natural to use the same RTL structure for
the new atomic fences as for the old memory_barrier expansion. The only problem
seen in current practice so far is with empty expansion for the new patterns.

Thanks.
Alexander
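
The get_last_insn () check above can be illustrated outside GCC with a toy
instruction list. This is only a sketch under stated assumptions: the insn
array, emit (), last_insn (), target_fence () and expand_fence () are
hypothetical stand-ins, not GCC internals, and the pretend target emits a
machine fence for the seq-cst model only.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for GCC's insn stream; every name here is hypothetical.  */
enum insn_kind { MACHINE_FENCE, COMPILER_BARRIER };
enum memmodel { MM_RELAXED, MM_ACQUIRE, MM_SEQ_CST };

#define MAX_INSNS 16
static enum insn_kind insns[MAX_INSNS];
static size_t n_insns;

/* Append one insn to the stream.  */
static void
emit (enum insn_kind kind)
{
  assert (n_insns < MAX_INSNS);
  insns[n_insns++] = kind;
}

/* Analogue of get_last_insn (): the most recent insn, or NULL if none.  */
static const enum insn_kind *
last_insn (void)
{
  return n_insns ? &insns[n_insns - 1] : NULL;
}

/* Pretend target pattern: only a seq-cst fence needs a machine insn;
   weaker models expand to nothing (an assumption of this sketch).  */
static void
target_fence (enum memmodel model)
{
  if (model == MM_SEQ_CST)
    emit (MACHINE_FENCE);
}

/* Generic expander: if the target emitted nothing and the model is not
   relaxed, fall back to a compiler-only barrier.  */
static void
expand_fence (enum memmodel model)
{
  const enum insn_kind *last = last_insn ();
  target_fence (model);
  if (last == last_insn () && model != MM_RELAXED)
    emit (COMPILER_BARRIER);
}
```

With this arrangement an acquire fence yields a single compiler barrier, a
seq-cst fence yields only the machine fence, and a relaxed fence yields
nothing, so the stream never carries redundant barriers around a real fence.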