From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: by sourceware.org (Postfix, from userid 48)
	id AD3ED3858CDA; Thu, 14 Sep 2023 13:17:55 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org AD3ED3858CDA
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org;
	s=default; t=1694697475;
	bh=fABkvhl6a4eHR7spfn6bG/TvXGxrbPZyUo0jP1KZ9as=;
	h=From:To:Subject:Date:From;
	b=EuyoNdFUBekszigQaSbi0na1eCXedA57rzVKF0fz2+Vje7fm7KGhmMCFprrrhWQKu
	 aV2yxhbxrLvGQ0mQIQbXbgpdYpZft8wYbm0obaLM0aY7dEpXw4ctjrSXT2v+qSxgw+
	 C3x0aWExSPDtfOksnzPWKSwTNl1U+kEohmfLx1fI=
From: "luke.geeson at cs dot ucl.ac.uk"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug translation/111416] New: [Armv7/v8 Mixing Bug]: 64-bit Sequentially Consistent Load can be Reordered before Store of RMW when v7 and v8 Implementations are Mixed
Date: Thu, 14 Sep 2023 13:17:55 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: new
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: translation
X-Bugzilla-Version: 14.0
X-Bugzilla-Keywords:
X-Bugzilla-Severity: normal
X-Bugzilla-Who: luke.geeson at cs dot ucl.ac.uk
X-Bugzilla-Status: UNCONFIRMED
X-Bugzilla-Resolution:
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags:
X-Bugzilla-Changed-Fields: bug_id short_desc product version bug_status bug_severity priority component assigned_to reporter target_milestone
Message-ID:
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
List-Id:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111416

Bug ID: 111416
Summary: [Armv7/v8 Mixing Bug]: 64-bit Sequentially Consistent Load can be Reordered before Store of RMW when v7 and v8 Implementations are Mixed
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: translation
Assignee: unassigned at gcc dot gnu.org
Reporter:
luke.geeson at cs dot ucl.ac.uk
Target Milestone: ---

Consider the following litmus test that has buggy behaviour:

```
C test

{ int64_t x = 0; int64_t y = 0 }

void P0 (_Atomic int64_t *x, _Atomic int64_t *y) {
  atomic_fetch_add_explicit(x,1,memory_order_seq_cst);
  int32_t r0 = atomic_load_explicit(y,memory_order_seq_cst);
}

void P1 (_Atomic int64_t *x, _Atomic int64_t *y) {
  atomic_store_explicit(y,1,memory_order_seq_cst);
  int32_t r0 = atomic_load_explicit(x,memory_order_seq_cst);
}

exists P0:r0 = 0 /\ P1:r0 = 0
```

where 'P0:r0 = 0' means that, on thread P0, local variable r0 has value 0.

When simulating this test under the C/C++ model from its initial state, the
outcome of execution in the exists clause is forbidden by the source model.
The allowed outcomes are:

```
{ P0:r0=0; P1:r0=1; }
{ P0:r0=1; P1:r0=0; }
{ P0:r0=1; P1:r0=1; }
```

The bug appears when the test is compiled as follows: P1 is compiled to target
armv7-a cortex-a53 (https://godbolt.org/z/efGnsa19G) using clang trunk, the
fetch_add on P0 is compiled to target a cortex-a53 using clang trunk
(`ldaexd;add;stlexd` loop), and the load on P0 is compiled to target a
cortex-a15 (`ldrd;dmb`) using GCC trunk.

The compiled program has the following outcomes when simulated under the
aarch32 model:

```
{ P0:r0=0; P1:r0=0; } <--- Forbidden by source model, bug!
{ P0:r0=0; P1:r0=1; }
{ P0:r0=1; P1:r0=0; }
{ P0:r0=1; P1:r0=1; }
```

which is due to the fact that the LDRD on P0 can be reordered before the
stlexd on P0, since there is no dmb barrier to prevent the reordering.

Since there is no acquire load on armv7, we propose to fix the bug by adding a
fence before the ldrd:

```
dmb ish; ldrd; dmb ish
```

which prevents the buggy outcome under the aarch32 memory model.

I have validated this bug whilst discussing with Wilco from Arm's compiler
teams.

This bug would not have been caught in normal execution, but only when multiple
implementations are mixed together.