From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-patches-return-459412-listarch-gcc-patches=gcc.gnu.org@gcc.gnu.org>
Received: (qmail 17283 invoked by alias); 31 Jul 2017 18:01:34 -0000
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id: <gcc-patches.gcc.gnu.org>
List-Archive: <http://gcc.gnu.org/ml/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-help@gcc.gnu.org>
Sender: gcc-patches-owner@gcc.gnu.org
Received: (qmail 17012 invoked by uid 89); 31 Jul 2017 18:01:34 -0000
Authentication-Results: sourceware.org; auth=none
X-Virus-Found: No
X-Spam-SWARE-Status: No, score=-2.2 required=5.0 tests=AWL,BAYES_00,RP_MATCHES_RCVD,SPF_PASS autolearn=ham version=3.3.2 spammy=standpoint, Hx-languages-length:2149, person
X-HELO: smtp.ispras.ru
Received: from bran.ispras.ru (HELO smtp.ispras.ru) (83.149.199.196) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Mon, 31 Jul 2017 18:01:31 +0000
Received: from monopod.intra.ispras.ru (monopod.intra.ispras.ru [10.10.3.121])	by smtp.ispras.ru (Postfix) with ESMTP id 71BC05FB44;	Mon, 31 Jul 2017 21:01:28 +0300 (MSK)
Date: Mon, 31 Jul 2017 18:01:00 -0000
From: Alexander Monakov <amonakov@ispras.ru>
To: Jeff Law <law@redhat.com>
cc: gcc-patches@gcc.gnu.org, Richard Henderson <rth@redhat.com>,     Uros Bizjak <ubizjak@gmail.com>
Subject: Re: [PATCH 1/2] x86,s390: add compiler memory barriers when expanding atomic_thread_fence (PR 80640)
In-Reply-To: <ef686357-8dac-41f6-4488-1c68f94e5bc7@redhat.com>
Message-ID: <alpine.LNX.2.20.13.1707312034110.2270@monopod.intra.ispras.ru>
References: <alpine.LNX.2.20.13.1705101620050.11444@monopod.intra.ispras.ru> <alpine.LNX.2.20.13.1705171248330.32526@monopod.intra.ispras.ru> <alpine.LNX.2.20.13.1705261358000.13474@monopod.intra.ispras.ru> <793d0ecb-7fbf-6670-c45b-9b1d436fc2fb@redhat.com> <alpine.LNX.2.20.13.1707262011470.12525@monopod.intra.ispras.ru> <499f27d4-7abc-a0a1-34af-2de2710058dc@redhat.com> <alpine.LNX.2.20.13.1707262040150.12525@monopod.intra.ispras.ru> <85f131c5-3ebd-0ed9-4bc1-78a9482c841c@redhat.com> <alpine.LNX.2.20.13.1707311947280.2270@monopod.intra.ispras.ru> <ef686357-8dac-41f6-4488-1c68f94e5bc7@redhat.com>
User-Agent: Alpine 2.20.13 (LNX 116 2015-12-14)
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
X-SW-Source: 2017-07/txt/msg02061.txt.bz2

On Mon, 31 Jul 2017, Jeff Law wrote:
> > Please consider that expand_mem_thread_fence is used to place fences around
> > seq-cst atomic loads&stores when the backend doesn't provide a direct pattern.
> > With compiler barriers on both sides of the machine barrier, the generated
> > sequence for a seq-cst atomic load will be 7 insns:
> > 
> >   asm volatile ("":::"memory");
> >   machine_seq_cst_fence ();
> >   asm volatile ("":::"memory");
> >   dst = mem[src];
> >   asm volatile ("":::"memory");
> >   machine_seq_cst_fence ();
> >   asm volatile ("":::"memory");
> > 
> > I can easily imagine people looking at RTL dumps with this overkill fencing
> > being unhappy about this.
> But the extra fences aren't actually going to impact anything except
> perhaps an unmeasurable compile-time hit.  ie, it may look bad, but I'd
> have a hard time believing it matters in practice.

I agree it doesn't matter that much from a compile-time/memory standpoint.
I meant it matters from the standpoint of a person working with the RTL dumps,
i.e. having to work through all that, especially if they need to work with
non-slim dumps.

> > I'd be more happy with detecting empty expansion via get_last_insn ().
> That seems like an unnecessary complication to me.

I think it's quite simple by GCC standards:

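  {
    /* Remember the tail of the insn stream before expanding the pattern.  */
    rtx_insn *last = get_last_insn ();
    emit_insn (targetm.gen_mem_thread_fence (GEN_INT (model)));
    /* Nothing was emitted: fall back to a compiler-only barrier.  */
    if (last == get_last_insn () && !is_mm_relaxed (model))
      expand_asm_memory_barrier ();
  }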
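
To show the idea in isolation, here is a toy model in plain C.  It is only a
sketch: every toy_* name is hypothetical and does not exist in GCC, the insn
stream is reduced to an array, and the "target" is assumed to emit a machine
fence for seq-cst only and nothing for weaker models.

```c
#include <stddef.h>

/* Toy stand-ins for GCC's insn stream; none of these names exist in GCC.  */
enum toy_insn { TOY_MACHINE_FENCE, TOY_COMPILER_BARRIER };
enum toy_model { TOY_RELAXED, TOY_ACQUIRE, TOY_SEQ_CST };

#define TOY_MAX_INSNS 16
enum toy_insn toy_stream[TOY_MAX_INSNS];
size_t toy_count;

/* Start over with an empty insn stream.  */
void
toy_reset (void)
{
  toy_count = 0;
}

/* Append one insn to the stream (silently dropped if the array is full).  */
void
toy_emit (enum toy_insn insn)
{
  if (toy_count < TOY_MAX_INSNS)
    toy_stream[toy_count++] = insn;
}

/* Assumed target pattern: a machine fence for seq-cst, nothing otherwise.  */
void
toy_gen_mem_thread_fence (enum toy_model model)
{
  if (model == TOY_SEQ_CST)
    toy_emit (TOY_MACHINE_FENCE);
}

/* Generic expander: add a compiler barrier only if the target pattern
   emitted nothing and the model is stronger than relaxed.  */
void
toy_expand_mem_thread_fence (enum toy_model model)
{
  size_t last = toy_count;
  toy_gen_mem_thread_fence (model);
  if (last == toy_count && model != TOY_RELAXED)
    toy_emit (TOY_COMPILER_BARRIER);
}
```

In this model an acquire fence yields a single compiler barrier, a seq-cst
fence yields only the machine fence, and a relaxed fence yields nothing, so
no redundant barriers show up in the stream.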

> I'd prefer instead to just emit the necessary fencing in the generic code
> and update the MD docs so that maintainers know they don't have to emit
> the RTL fences themselves.

I agree the docs need an update, but hopefully not in this way.  The legacy
__sync_synchronize barrier has always been required to be a compiler barrier
when expanded to RTL, and it's quite natural to use the same RTL structure
for new atomic fences as for the old memory_barrier expansion.  The only
problem seen so far in current practice is with empty expansion for new patterns.
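
For context, a user-level sketch of why an empty expansion still has to act
as a compiler barrier.  The names payload, ready, publish and consume are
made up for illustration.  On a strongly ordered target the release and
acquire fences below may need no machine instruction at all, yet the
compiler must still not move the payload accesses across them.

```c
#include <stdatomic.h>

int payload;        /* plain, non-atomic data being handed over */
atomic_int ready;   /* flag announcing that payload is valid */

/* Writer: store the data, then raise the flag.  The release fence must
   keep the payload store ahead of the flag store, even when it expands
   to no machine instruction.  */
void
publish (int value)
{
  payload = value;
  atomic_thread_fence (memory_order_release);
  atomic_store_explicit (&ready, 1, memory_order_relaxed);
}

/* Reader: returns -1 if nothing has been published yet.  The acquire
   fence must keep the payload load after the flag load.  */
int
consume (void)
{
  if (!atomic_load_explicit (&ready, memory_order_relaxed))
    return -1;
  atomic_thread_fence (memory_order_acquire);
  return payload;
}
```

If the fence expanded to nothing at all in RTL, later passes would be free
to reorder the plain accesses around the relaxed atomic ones.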

Thanks.
Alexander