Re: Fences/Barriers when mixing C++ atomics and non-atomics

From: Hans Boehm <hboehm@google.com>
To: Vineet Gupta <vineetg@rivosinc.com>
Cc: tech-unprivileged@lists.riscv.org, gcc@gcc.gnu.org,
	 Hongyu Wang <hongyu.wang@intel.com>,
	Uros Bizjak <ubizjak@gmail.com>
Subject: Re: Fences/Barriers when mixing C++ atomics and non-atomics
Date: Thu, 13 Oct 2022 13:54:25 -0700
Message-ID: <CAMOCf+j9FsNC9o2AW-gQi-3mVU0AvvXqJarTrV8SBUouiU3eMQ@mail.gmail.com>
In-Reply-To: <8c7380d2-2587-78c7-a85a-a4c8afef2284@rivosinc.com>

The generated code here is correct in both cases. In the RISC-V case, I
believe it is conservative, at a minimum, in that atomics should not imply
IO ordering. We had an earlier discussion, which seemed to have consensus
in favor of that opinion. I believe clang does not enforce IO ordering.

You can think of a "sequentially consistent" load roughly as enforcing two
properties:

1) It behaves as an "acquire" load. Later (in program order) memory
operations do not advance past it. This is implicit for x86. It requires
the trailing fence on RISC-V, which could probably be weakened to r,rw.

2) It ensures that seq_cst operations are fully ordered. This means that,
in addition to (1), and the corresponding fence for stores, every seq_cst
store must be separated from a seq_cst load by at least a w,r fence, so a
seq_cst store followed by a seq_cst load is not reordered. w,r fences are
discouraged on RISC-V, and probably no better than rw,rw, so that's how the
leading fence got there. (Again, the IO ordering should disappear. It's the
responsibility of IO code to insert that explicitly, rather than paying for
it everywhere.)
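
Property (2) can be illustrated with the classic store-buffering litmus test. The following is a minimal sketch, not code from this thread: the helper name `count_forbidden_outcomes` and the variables `x`, `y`, `r1`, `r2` are made up for illustration. Each thread does a seq_cst store followed by a seq_cst load of the other variable. If either thread's load could move ahead of its own store, both loads could return 0; the total order on seq_cst operations rules that outcome out.

```cpp
#include <atomic>
#include <thread>

// Store-buffering litmus test (hypothetical helper, for illustration only).
// Returns the number of iterations in which both threads read 0, an outcome
// that the single total order on seq_cst operations forbids.
int count_forbidden_outcomes(int iterations) {
    int forbidden = 0;
    for (int i = 0; i < iterations; ++i) {
        std::atomic<int> x{0}, y{0};
        int r1 = -1, r2 = -1;

        std::thread t1([&] {
            x.store(1, std::memory_order_seq_cst);   // seq_cst store ...
            r1 = y.load(std::memory_order_seq_cst);  // ... then seq_cst load
        });
        std::thread t2([&] {
            y.store(1, std::memory_order_seq_cst);
            r2 = x.load(std::memory_order_seq_cst);
        });

        // join() synchronizes with thread completion, so reading r1 and r2
        // here is race-free.
        t1.join();
        t2.join();

        if (r1 == 0 && r2 == 0) {
            ++forbidden;
        }
    }
    return forbidden;
}
```

With a conforming implementation the function should return 0 for any iteration count. Weakening both loads to `memory_order_acquire` and both stores to `memory_order_release` would make the "both zero" outcome permissible, which is exactly the store-to-load reordering that the w,r (or rw,rw) fence prevents.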

x86 does (2) by associating that fence with stores instead of loads, either
by using explicit fences after stores, or by turning stores into xchg.
RISC-V could do the same. And I believe that if the current A extension
were the final word on the architecture, it should. But that convention is
not compatible with the later introduction of an "acquire load", which I
think is essential for performance, at least on larger cores. So I think
the two-fence mapping for loads should be maintained for now, as I
suggested in the document I posted to the list.
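
As a rough sketch of where the cost lands under each convention, consider the three accessors below. The names `flag`, `publish`, `observe`, and `observe_acquire` are hypothetical and chosen only for illustration. The comments describe typical mappings as discussed in this thread, not guaranteed compiler output; actual codegen depends on the compiler version and target options.

```cpp
#include <atomic>

std::atomic<int> flag{0};  // hypothetical shared variable

// seq_cst store.
// x86 convention: the w,r ordering is paid here, typically as an xchg or as
// a plain store followed by an explicit fence.
void publish(int v) {
    flag.store(v, std::memory_order_seq_cst);
}

// seq_cst load.
// x86 convention: typically a plain load, since the store side already
// carries the fence.
// RISC-V two-fence mapping: a leading full fence plus a trailing fence
// around the load, as in the quoted gcc 12 output.
int observe() {
    return flag.load(std::memory_order_seq_cst);
}

// Acquire load: only property (1) is needed, so no leading fence is
// required; on RISC-V a trailing fence suffices under the fence-based
// mapping.
int observe_acquire() {
    return flag.load(std::memory_order_acquire);
}
```

A single-threaded call such as `publish(7)` followed by `observe()` returns 7 under either convention; the conventions differ only in which side carries the fence, and both sides of a program must agree on one of them.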

Hans

On Thu, Oct 13, 2022 at 12:31 PM Vineet Gupta <vineetg@rivosinc.com> wrote:

> Hi,
>
> I have a testcase (from real workloads) involving C++ atomics and am trying
> to understand the codegen (gcc 12) for RVWMO and x86.
> It mixes atomics with non-atomics, so it is not obvious what the intended
> behavior is, hence the explicit CC of subject matter experts
> (apologies for that in advance).
>
> The test has a non-atomic store followed by an atomic_load(SEQ_CST). I
> assume that unadorned direct access defaults to the safest, most
> conservative ordering, seq_cst.
>
>     extern int g;
>     std::atomic<int> a;
>
>     int bar_noaccessor(int n, int *n2)
>     {
>          *n2 = g;
>          return n + a;
>     }
>
>     int bar_seqcst(int n, int *n2)
>     {
>          *n2 = g;
>          return n + a.load(std::memory_order_seq_cst);
>     }
>
> On RV (rvwmo), with current gcc 12 we get 2 full fences around the load
> as prescribed by Privileged Spec, Chapter A, Table A.6 (Mappings from
> C/C++ to RISC-V primitives).
>
>     _Z10bar_seqcstiPi:
>     .LFB382:
>          .cfi_startproc
>          lui    a5,%hi(g)
>          lw    a5,%lo(g)(a5)
>          sw    a5,0(a1)
>          fence    iorw,iorw    # leading fence
>          lui    a5,%hi(a)
>          lw    a5,%lo(a)(a5)
>          fence    iorw,iorw    # trailing fence
>          addw    a0,a5,a0
>          ret
>
>
> OTOH, for x86 (same default toggles) there are no barriers at all.
>
>     _Z10bar_seqcstiPi:
>          endbr64
>          movl    g(%rip), %eax
>          movl    %eax, (%rsi)
>          movl    a(%rip), %eax
>          addl    %edi, %eax
>          ret
>
>
> My naive intuition was that x86 TSO would require a fence before
> load(seq_cst) for a prior store, even if that store was non-atomic, to
> ensure the load didn't bubble up ahead of the store.
>
> Perhaps this raises the general question of intermixing non-atomic
> accesses with atomics, and whether that is undefined behavior or some
> such. I skimmed through the C++14 specification chapter on the Atomic
> Operations library but nothing jumped out on the topic.
>
> Or is it much deeper, related to the as-if rule or something?
>
> Thx,
> -Vineet
>
