From: Vineet Gupta <vineetg@rivosinc.com>
To: Hans Boehm <hboehm@google.com>
Cc: tech-unprivileged@lists.riscv.org, gcc@gcc.gnu.org,
Hongyu Wang <hongyu.wang@intel.com>,
Uros Bizjak <ubizjak@gmail.com>
Subject: Re: Fences/Barriers when mixing C++ atomics and non-atomics
Date: Thu, 13 Oct 2022 14:11:13 -0700 [thread overview]
Message-ID: <9838fc0f-a519-bfd3-3eb2-88255742ae57@rivosinc.com> (raw)
In-Reply-To: <CAMOCf+j9FsNC9o2AW-gQi-3mVU0AvvXqJarTrV8SBUouiU3eMQ@mail.gmail.com>
[-- Attachment #1: Type: text/plain, Size: 4981 bytes --]
Hi Hans,
On 10/13/22 13:54, Hans Boehm wrote:
> The generated code here is correct in both cases. In the RISC--V case,
> I believe it is conservative, at a minimum, in that atomics should not
> imply IO ordering. We had an earlier discussion, which seemed to have
> consensus in favor of that opinion. I believe clang does not enforce
> IO ordering.
>
> You can think of a "sequentially consistent" load roughly as enforcing
> two properties:
>
> 1) It behaves as an "acquire" load. Later (in program order) memory
> operations do not advance past it. This is implicit for x86. It
> requires the trailing fence on RISC-V, which could probably be
> weakened to r,rw.
Acq implies later things won't leak out, but prior things could still
leak-in, meaning prior write could happen after load which contradicts
what user is asking by load(seq_cst) on x86 ?
>
> 2) It ensures that seq_cst operations are fully ordered. This means
> that, in addition to (1), and the corresponding fence for stores,
> every seq_cst store must be separated from a seq_cst load by at least
> a w,r fence, so a seq_cst store followed by a seq_cst load is not
> reordered.
This makes sense when both store -> load are seq_cst.
But the question is what happens when that store is non atomic. IOW if
we had a store(relaxed) -> load(seq_cst) would the generated code still
ensure that load had a full barrier to prevent
> w,r fences are discouraged on RISC-V, and probably no better than
> rw,rw, so that's how the leading fence got there. (Again the io
> ordering should disappear. It's the responsibility of IO code to
> insert that explicitly, rather than paying for it everywhere.)
Thanks for explaining the RV semantics.
>
> x86 does (2) by associating that fence with stores instead of loads,
> either by using explicit fences after stores, or by turning stores
> into xchg.
That makes sense as x86 has ld->ld and ld -> st architecturally ordered,
so any fences ought to be associated with st.
Thx,
-Vineet
> RISC-V could do the same. And I believe that if the current A
> extension were the final word on the architecture, it should. But that
> convention is not compatible with the later introduction of an
> "acquire load", which I think is essential for performance, at least
> on larger cores. So I think the two fence mapping for loads should be
> maintained for now, as I suggested in the document I posted to the list.
>
> Hans
>
> On Thu, Oct 13, 2022 at 12:31 PM Vineet Gupta <vineetg@rivosinc.com>
> wrote:
>
> Hi,
>
> I have a testcase (from real workloads) involving C++ atomics and
> trying
> to understand the codegen (gcc 12) for RVWMO and x86.
> It does mix atomics with non-atomics so not obvious what the
> behavior is
> intended to be hence some explicit CC of subject matter experts
> (apologies for that in advance).
>
> Test has a non-atomic store followed by an atomic_load(SEQ_CST). I
> assume that unadorned direct access defaults to
> safest/conservative seq_cst.
>
> extern int g;
> std::atomic<int> a;
>
> int bar_noaccessor(int n, int *n2)
> {
> *n2 = g;
> return n + a;
> }
>
> int bar_seqcst(int n, int *n2)
> {
> *n2 = g;
> return n + a.load(std::memory_order_seq_cst);
> }
>
> On RV (rvwmo), with current gcc 12 we get 2 full fences around the
> load
> as prescribed by Privileged Spec, Chpater A, Table A.6 (Mappings from
> C/C++ to RISC-V primitives).
>
> _Z10bar_seqcstiPi:
> .LFB382:
> .cfi_startproc
> lui a5,%hi(g)
> lw a5,%lo(g)(a5)
> sw a5,0(a1)
> *fence iorw,iorw*
> lui a5,%hi(a)
> lw a5,%lo(a)(a5)
> *fence iorw,iorw*
> addw a0,a5,a0
> ret
>
>
> OTOH, for x86 (same default toggles) there's no barriers at all.
>
> _Z10bar_seqcstiPi:
> endbr64
> movl g(%rip), %eax
> movl %eax, (%rsi)
> movl a(%rip), %eax
> addl %edi, %eax
> ret
>
>
> My naive intuition was x86 TSO would require a fence before
> load(seq_cst) for a prior store, even if that store was non
> atomic, so
> ensure load didn't bubble up ahead of store.
>
> Perhaps this begs the general question of intermixing non atomic
> accesses with atomics and if that is undefined behavior or some
> such. I
> skimmed through C++14 specification chapter Atomic Operations library
> but nothing's jumping out on the topic.
>
> Or is it much deeper, related to As-if rule or something.
>
> Thx,
> -Vineet
>
next prev parent reply other threads:[~2022-10-13 21:11 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-10-13 19:31 Vineet Gupta
2022-10-13 20:15 ` Jonathan Wakely
2022-10-13 20:30 ` Uros Bizjak
2022-10-13 21:14 ` Vineet Gupta
2022-10-13 21:29 ` Uros Bizjak
2022-10-13 20:54 ` Hans Boehm
2022-10-13 21:11 ` Vineet Gupta [this message]
2022-10-13 21:43 ` Hans Boehm
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=9838fc0f-a519-bfd3-3eb2-88255742ae57@rivosinc.com \
--to=vineetg@rivosinc.com \
--cc=gcc@gcc.gnu.org \
--cc=hboehm@google.com \
--cc=hongyu.wang@intel.com \
--cc=tech-unprivileged@lists.riscv.org \
--cc=ubizjak@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).