Subject: Re: Fences/Barriers when mixing C++ atomics and non-atomics
From: Uros Bizjak
Date: Thu, 13 Oct 2022 22:30:15 +0200
To: Vineet Gupta
Cc: tech-unprivileged@lists.riscv.org, gcc@gcc.gnu.org, Hans Boehm, Hongyu Wang

On Thu, Oct 13, 2022 at 9:31 PM Vineet Gupta wrote:
>
> Hi,
>
> I have a testcase (from real workloads) involving C++ atomics, and I am
> trying to understand the codegen (gcc 12) for RVWMO and x86. It does mix
> atomics with non-atomics, so it is not obvious what the behavior is
> intended to be; hence the explicit CC of subject-matter experts
> (apologies for that in advance).
>
> The test has a non-atomic store followed by an atomic_load(SEQ_CST).
> I assume that an unadorned direct access defaults to the safest/most
> conservative seq_cst.
>
> extern int g;
> std::atomic<int> a;
>
> int bar_noaccessor(int n, int *n2)
> {
>     *n2 = g;
>     return n + a;
> }
>
> int bar_seqcst(int n, int *n2)
> {
>     *n2 = g;
>     return n + a.load(std::memory_order_seq_cst);
> }
>
> On RV (RVWMO), with current gcc 12 we get two full fences around the
> load, as prescribed by the Privileged Spec, Chapter A, Table A.6
> (Mappings from C/C++ to RISC-V primitives).
>
> _Z10bar_seqcstiPi:
> .LFB382:
>         .cfi_startproc
>         lui     a5,%hi(g)
>         lw      a5,%lo(g)(a5)
>         sw      a5,0(a1)
>         *fence iorw,iorw*
>         lui     a5,%hi(a)
>         lw      a5,%lo(a)(a5)
>         *fence iorw,iorw*
>         addw    a0,a5,a0
>         ret
>
> OTOH, for x86 (same default toggles) there are no barriers at all.
>
> _Z10bar_seqcstiPi:
>         endbr64
>         movl    g(%rip), %eax
>         movl    %eax, (%rsi)
>         movl    a(%rip), %eax
>         addl    %edi, %eax
>         ret

Regarding the x86 memory model, please see the Intel® 64 and IA-32
Architectures Software Developer's Manual, Volume 3A, section 8.2 [1].

[1] https://www.intel.com/content/www/us/en/developer/articles/technical/intel-sdm.html

> My naive intuition was that x86 TSO would require a fence before a
> load(seq_cst) for a prior store, even if that store was non-atomic, to
> ensure the load didn't bubble up ahead of the store.

As documented in the SDM above, the x86 memory model guarantees that

• Reads are not reordered with other reads.
• Writes are not reordered with older reads.
• Writes to memory are not reordered with other writes, with the
  following exceptions: ...
• Reads may be reordered with older writes to different locations but
  not with older writes to the same location.
...

Uros.

> Perhaps this begs the general question of intermixing non-atomic
> accesses with atomics, and whether that is undefined behavior or some
> such. I skimmed through the C++14 specification chapter on the Atomic
> Operations library, but nothing is jumping out on the topic.
>
> Or is it much deeper, related to the as-if rule or something.
>
> Thx,
> -Vineet
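
As a concrete illustration of the last SDM rule quoted above (reads may
be reordered with older writes to different locations), here is a
minimal store-buffering litmus sketch in C++. It is not taken from the
original exchange; the cpu0/cpu1 names and the choice of relaxed atomics
are purely illustrative, and any single run may well not exhibit the
reordering.

#include <atomic>
#include <cstdio>
#include <thread>

// Hypothetical store-buffering litmus test, for illustration only.
std::atomic<int> x{0}, y{0};
int r1, r2;

void cpu0()
{
    x.store(1, std::memory_order_relaxed);   // older write, to x
    r1 = y.load(std::memory_order_relaxed);  // younger read, different location
}

void cpu1()
{
    y.store(1, std::memory_order_relaxed);   // older write, to y
    r2 = x.load(std::memory_order_relaxed);  // younger read, different location
}

int main()
{
    std::thread t0(cpu0), t1(cpu1);
    t0.join();
    t1.join();
    // Per the quoted rule, each core's read may complete before its older
    // write becomes visible to the other core, so r1 == 0 && r2 == 0 is an
    // allowed outcome on x86 with plain MOVs (and with relaxed atomics in
    // C++).
    if (r1 == 0 && r2 == 0)
        std::printf("observed r1 == 0 && r2 == 0 (store->load reordering)\n");
    return 0;
}

Making all four accesses memory_order_seq_cst rules that outcome out in
the C++ memory model; in the usual C++-to-x86 mappings, the full barrier
(an MFENCE or a locked RMW) is typically attached to the seq_cst store
rather than to the load.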