From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <xry111@xry111.site>
Received: from xry111.site (xry111.site [89.208.246.23])
	by sourceware.org (Postfix) with ESMTPS id 64F2F3851897
	for <gcc-patches@gcc.gnu.org>; Mon, 14 Nov 2022 08:19:52 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 64F2F3851897
Authentication-Results: sourceware.org; dmarc=pass (p=reject dis=none) header.from=xry111.site
Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=xry111.site
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=xry111.site;
	s=default; t=1668413991;
	bh=l5KM7AbgEtx6756vqJiLEkIMCoPDN2t6eHVtTSusv4A=;
	h=Subject:From:To:Cc:Date:In-Reply-To:References:From;
	b=bPNLGntTGCATg0Y47LGKgCylur7vSqEQNqx/9Gol4yfIHy+ahp7vQVjKKPeXM4U47
	 X/4yT53TFKR0bhrUkvEFSq9lY2srJn9qSo1DVs8DDPngoAqK44jVyM818/UjKcs5F6
	 RkfeHeTQXRzBCf62htH/LoHuFDfMNwIpDl6GAwZY=
Received: from localhost.localdomain (xry111.site [IPv6:2001:470:683e::1])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
	 key-exchange ECDHE (P-256) server-signature ECDSA (P-384))
	(Client did not present a certificate)
	(Authenticated sender: xry111@xry111.site)
	by xry111.site (Postfix) with ESMTPSA id 95C63667B7;
	Mon, 14 Nov 2022 03:19:49 -0500 (EST)
Message-ID: <0fa5e4e5ce325a8e432e9e0bd2e598aa48666501.camel@xry111.site>
Subject: Re: [PATCH] libatomic: Handle AVX+CX16 AMD like Intel for 16b
 atomics [PR104688]
From: Xi Ruoyao <xry111@xry111.site>
To: Uros Bizjak <ubizjak@gmail.com>, Jakub Jelinek <jakub@redhat.com>, 
	Mayshao-oc <Mayshao-oc@zhaoxin.com>
Cc: Richard Biener <rguenther@suse.de>, Jeff Law <jeffreyalaw@gmail.com>, 
	gcc-patches@gcc.gnu.org, Florian Weimer <fweimer@redhat.com>, "H.J. Lu"
	 <hjl.tools@gmail.com>
Date: Mon, 14 Nov 2022 16:19:48 +0800
In-Reply-To: <CAFULd4avaPhry66MvWmZDqnPn4ShKbvmPw6K_VQYuVpae5pe8A@mail.gmail.com>
References: <Y3Hy1ckL3ZluEOSi@tucnak>
	 <CAFULd4avaPhry66MvWmZDqnPn4ShKbvmPw6K_VQYuVpae5pe8A@mail.gmail.com>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
User-Agent: Evolution 3.46.0 
MIME-Version: 1.0
X-Spam-Status: No, score=1.0 required=5.0 tests=BAYES_00,BODY_8BITS,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,FROM_SUSPICIOUS_NTLD,KAM_SHORT,LIKELY_SPAM_FROM,PDS_OTHER_BAD_TLD,SPF_HELO_PASS,SPF_PASS,TXREP autolearn=no autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org
List-Id: <gcc-patches.gcc.gnu.org>

On Mon, 2022-11-14 at 08:55 +0100, Uros Bizjak via Gcc-patches wrote:
> On Mon, Nov 14, 2022 at 8:48 AM Jakub Jelinek <jakub@redhat.com>
> wrote:
> >=20
> > Hi!
> >=20
> > Working virtually out of Baker Island.
> >=20
> > We got a response from AMD in
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=3D104688#c10
> > so the following patch starts treating AMD with AVX and CMPXCHG16B
> > ISAs like Intel by using vmovdqa for atomic load/store in libatomic.
> >=20
> > Ok for trunk if it passes bootstrap/regtest?
> >=20
> > 2022-11-13=C2=A0 Jakub Jelinek=C2=A0 <jakub@redhat.com>
> >=20
> > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 PR target/104688
> > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 * config/x86/init.c (__libat=
_feat1_init): Revert 2022-03-17
> > change
> > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 - on x86_64 no longer clear =
bit_AVX if CPU vendor is not
> > Intel.
> >=20
> > --- libatomic/config/x86/init.c.jj=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 2022-0=
3-17
> > 18:48:56.708723194 +0100
> > +++ libatomic/config/x86/init.c 2022-11-13 18:23:26.315440071 -1200
> > @@ -34,18 +34,6 @@ __libat_feat1_init (void)
> > =C2=A0=C2=A0 unsigned int eax, ebx, ecx, edx;
> > =C2=A0=C2=A0 FEAT1_REGISTER =3D 0;
> > =C2=A0=C2=A0 __get_cpuid (1, &eax, &ebx, &ecx, &edx);
> > -#ifdef __x86_64__
> > -=C2=A0 if ((FEAT1_REGISTER & (bit_AVX | bit_CMPXCHG16B))
> > -=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 =3D=3D (bit_AVX | bit_CMPXCHG16B))
> > -=C2=A0=C2=A0=C2=A0 {
> > -=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 /* Intel SDM guarantees that 16-byte VM=
OVDQA on 16-byte
> > aligned address
> > -=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 is atomic, but so far we do=
n't have this guarantee from
> > AMD.=C2=A0 */
> > -=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 unsigned int ecx2 =3D 0;
> > -=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 __get_cpuid (0, &eax, &ebx, &ecx2, &edx=
);
> > -=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 if (ecx2 !=3D signature_INTEL_ecx)
> > -=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 FEAT1_REGISTER &=3D ~bit_AVX;
>=20
> We still need this, but also bypass it for AMD signature. There are
> other vendors than Intel and AMD.

Mayshao: what is the status of this feature on Zhaoxin product lines?
IIRC they support AVX (though it is disabled by default in GCC for
Lujiazui), but we don't know whether they guarantee that 16-byte
aligned accesses are atomic.

>=20
> OK with the above addition.
>=20
> Thanks,
> Uros.
>=20
> > -=C2=A0=C2=A0=C2=A0 }
> > -#endif
> > =C2=A0=C2=A0 /* See the load in load_feat1.=C2=A0 */
> > =C2=A0=C2=A0 __atomic_store_n (&__libat_feat1, FEAT1_REGISTER,
> > __ATOMIC_RELAXED);
> > =C2=A0=C2=A0 return FEAT1_REGISTER;
> >=20
> > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 Jakub
> >=20

--=20
Xi Ruoyao <xry111@xry111.site>
School of Aerospace Science and Technology, Xidian University