From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-wm1-x333.google.com (mail-wm1-x333.google.com [IPv6:2a00:1450:4864:20::333]) by sourceware.org (Postfix) with ESMTPS id 38FF73858D3C for ; Mon, 25 Sep 2023 20:54:37 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 38FF73858D3C Authentication-Results: sourceware.org; dmarc=pass (p=quarantine dis=none) header.from=googlemail.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=googlemail.com Received: by mail-wm1-x333.google.com with SMTP id 5b1f17b1804b1-3ff1c397405so83667795e9.3 for ; Mon, 25 Sep 2023 13:54:37 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=googlemail.com; s=20230601; t=1695675276; x=1696280076; darn=gcc.gnu.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=B5RbTvQda3sWe/5wO3CeqDdrQ/Rjn4TuHxflKTKerWE=; b=LK9jehTUQF7xXuO+VFRNGlF/rsVFG6/DV+0kvGqyPB2UgXuyQyeGlt8O+2ZVxamKHv wGVuHZ++n8ypzklJMf8Ftn0q45Um0nTq95kXEVWzQDXFKsSgcnelrR2CBl+K2A6F3I56 iM3oZQXTxKha6ixH9lgh7rhdItRasgdm1DsJ0KIT9+Dy3ZejeP/YPNbgfbZdi+nbeNd8 pCsAZc3oXqXkIoRopxVD+XkLYAkpOEJDGStalmgtv9h/QMGW4esQPqqdifGWdG5xa/VX rgfD7aDItIV4/6GE2XRRmNi2QEFxI25XpcUinkpH0zTEou4Bc+oho3JcB9EslPYY30qW f3HA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1695675276; x=1696280076; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=B5RbTvQda3sWe/5wO3CeqDdrQ/Rjn4TuHxflKTKerWE=; b=OWnO5fwS1Cl0FaVOH8K91/g6QoCqBt0P3DebR9mXVjNWvKW4dMPvDbX4QRUsH2EVVS B73HrYe2CorP9wmYscVUAHo8XSvBLuSEZvJcgHkNA9DZ+iZ6rSOUMpYJvCQapvcO5mbC u8Pzs1OOpiQWBJCZx3qIKU9ja7cHZrXeNMMq4aqxQorHRHUYtUyFu8u/eihgTQhp9R23 gH0gMwbuQd+eSAEjMnaqOa6DPTD1ysgbJ8ieYMvNz1zMlCaUBKca/40B9pLBC+UBo0WU l9f19IzM+STku1pZjiqC6t2LoVLuqla8/YJKNB2x9CiFaKfIymEiO4ydcvDXu4aGihuP AMQw== X-Gm-Message-State: AOJu0YyNfUnIcYm86Eqv5v2r4CSfVsUwuSl6wO52LvAH6eZXDS1FPJ0y FzJ7t/9ppHTvtSFymksBaTowf+IOLo+9lKkXtihDmJ5ikq7CXQ== X-Google-Smtp-Source: AGHT+IF2sk0bPFCdRjZPc1pAbAjm/8fc5AeRD/mkQSqseJL8CMDtA9QhIAEIhoTH1uet/F+nU1tXLe/Ro/ClRGemds8= X-Received: by 2002:a7b:c85a:0:b0:3fe:2011:a7ce with SMTP id c26-20020a7bc85a000000b003fe2011a7cemr6859156wml.6.1695675275626; Mon, 25 Sep 2023 13:54:35 -0700 (PDT) MIME-Version: 1.0 References: In-Reply-To: From: Ramana Radhakrishnan Date: Mon, 25 Sep 2023 21:54:24 +0100 Message-ID: Subject: Re: [PATCH] AArch64: Fix __sync_val_compare_and_swap [PR111404] To: Wilco Dijkstra Cc: GCC Patches , Richard Sandiford , Kyrylo Tkachov , Richard Earnshaw Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Spam-Status: No, score=-9.1 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,FREEMAIL_FROM,GIT_PATCH_0,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On Wed, Sep 13, 2023 at 3:55=E2=80=AFPM Wilco Dijkstra via Gcc-patches wrote: > > > __sync_val_compare_and_swap may be used on 128-bit types and either calls= the > outline atomic code or uses an inline loop. On AArch64 LDXP is only atom= ic if > the value is stored successfully using STXP, but the current implementati= ons > do not perform the store if the comparison fails. In this case the value= returned > is not read atomically. IIRC, the previous discussions in this space revolved around the difficulty with the store writing to readonly memory which is why I think we went with LDXP in this form. Has something changed from then ? Reviewed-by : Ramana Radhakrishnan regards Ramana > > Passes regress/bootstrap, OK for commit? > > gcc/ChangeLog/ > PR target/111404 > * config/aarch64/aarch64.cc (aarch64_split_compare_and_swap): > For 128-bit store the loaded value and loop if needed. > > libgcc/ChangeLog/ > PR target/111404 > * config/aarch64/lse.S (__aarch64_cas16_acq_rel): Execute STLXP u= sing > either new value or loaded value. > > --- > > diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.c= c > index 5e8d0a0c91bc7719de2a8c5627b354cf905a4db0..c44c0b979d0cc3755c61dcf56= 6cfddedccebf1ea 100644 > --- a/gcc/config/aarch64/aarch64.cc > +++ b/gcc/config/aarch64/aarch64.cc > @@ -23413,11 +23413,11 @@ aarch64_split_compare_and_swap (rtx operands[]) > mem =3D operands[1]; > oldval =3D operands[2]; > newval =3D operands[3]; > - is_weak =3D (operands[4] !=3D const0_rtx); > model_rtx =3D operands[5]; > scratch =3D operands[7]; > mode =3D GET_MODE (mem); > model =3D memmodel_from_int (INTVAL (model_rtx)); > + is_weak =3D operands[4] !=3D const0_rtx && mode !=3D TImode; > > /* When OLDVAL is zero and we want the strong version we can emit a ti= ghter > loop: > @@ -23478,6 +23478,33 @@ aarch64_split_compare_and_swap (rtx operands[]) > else > aarch64_gen_compare_reg (NE, scratch, const0_rtx); > > + /* 128-bit LDAXP is not atomic unless STLXP succeeds. So for a mismat= ch, > + store the returned value and loop if the STLXP fails. */ > + if (mode =3D=3D TImode) > + { > + rtx_code_label *label3 =3D gen_label_rtx (); > + emit_jump_insn (gen_rtx_SET (pc_rtx, gen_rtx_LABEL_REF (Pmode, lab= el3))); > + emit_barrier (); > + > + emit_label (label2); > + aarch64_emit_store_exclusive (mode, scratch, mem, rval, model_rtx)= ; > + > + if (aarch64_track_speculation) > + { > + /* Emit an explicit compare instruction, so that we can correct= ly > + track the condition codes. */ > + rtx cc_reg =3D aarch64_gen_compare_reg (NE, scratch, const0_rtx= ); > + x =3D gen_rtx_NE (GET_MODE (cc_reg), cc_reg, const0_rtx); > + } > + else > + x =3D gen_rtx_NE (VOIDmode, scratch, const0_rtx); > + x =3D gen_rtx_IF_THEN_ELSE (VOIDmode, x, > + gen_rtx_LABEL_REF (Pmode, label1), pc_rtx= ); > + aarch64_emit_unlikely_jump (gen_rtx_SET (pc_rtx, x)); > + > + label2 =3D label3; > + } > + > emit_label (label2); > > /* If we used a CBNZ in the exchange loop emit an explicit compare wit= h RVAL > diff --git a/libgcc/config/aarch64/lse.S b/libgcc/config/aarch64/lse.S > index dde3a28e07b13669533dfc5e8fac0a9a6ac33dbd..ba05047ff02b6fc5752235bff= a924fc4a2f48c04 100644 > --- a/libgcc/config/aarch64/lse.S > +++ b/libgcc/config/aarch64/lse.S > @@ -160,6 +160,8 @@ see the files COPYING3 and COPYING.RUNTIME respective= ly. If not, see > #define tmp0 16 > #define tmp1 17 > #define tmp2 15 > +#define tmp3 14 > +#define tmp4 13 > > #define BTI_C hint 34 > > @@ -233,10 +235,11 @@ STARTFN NAME(cas) > 0: LDXP x0, x1, [x4] > cmp x0, x(tmp0) > ccmp x1, x(tmp1), #0, eq > - bne 1f > - STXP w(tmp2), x2, x3, [x4] > - cbnz w(tmp2), 0b > -1: BARRIER > + csel x(tmp2), x2, x0, eq > + csel x(tmp3), x3, x1, eq > + STXP w(tmp4), x(tmp2), x(tmp3), [x4] > + cbnz w(tmp4), 0b > + BARRIER > ret > > #endif >