From: Wilco Dijkstra
To: GCC Patches
CC: Richard Sandiford, Kyrylo Tkachov
Subject: [PATCH] libatomic: Enable lock-free 128-bit atomics on AArch64 [PR110061]
Date: Fri, 2 Jun 2023 17:28:10 +0000

Enable lock-free 128-bit atomics on AArch64. This is backwards compatible with
existing binaries, gives better performance than locking atomics and is what
most users expect.

Note 128-bit atomic loads use a load/store exclusive loop if LSE2 is not supported.
This results in an implicit store which is invisible to software as long as the given
address is writeable (which will be true when using atomics in actual code).

A simple test on an old Cortex-A72 showed 2.7x speedup of 128-bit atomics.

Passes regress, OK for commit?

libatomic/
	PR target/110061
	config/linux/aarch64/atomic_16.S: Implement lock-free ARMv8.0 atomics.
	config/linux/aarch64/host-config.h: Use atomic_16.S for baseline v8.0.
	State we have lock-free atomics.

---

diff --git a/libatomic/config/linux/aarch64/atomic_16.S b/libatomic/config/linux/aarch64/atomic_16.S
index 05439ce394b9653c9bcb582761ff7aaa7c8f9643..0485c284117edf54f41959d2fab9341a9567b1cf 100644
--- a/libatomic/config/linux/aarch64/atomic_16.S
+++ b/libatomic/config/linux/aarch64/atomic_16.S
@@ -22,6 +22,21 @@
    . */


+/* AArch64 128-bit lock-free atomic implementation.
+
+   128-bit atomics are now lock-free for all AArch64 architecture versions.
+   This is backwards compatible with existing binaries and gives better
+   performance than locking atomics.
+
+   128-bit atomic loads use a exclusive loop if LSE2 is not supported.
+   This results in an implicit store which is invisible to software as long
+   as the given address is writeable. Since all other atomics have explicit
+   writes, this will be true when using atomics in actual code.
+
+   The libat__16 entry points are ARMv8.0.
+   The libat__16_i1 entry points are used when LSE2 is available. */
+
+
 	.arch armv8-a+lse

 #define ENTRY(name) \
@@ -37,6 +52,10 @@ name: \
 	.cfi_endproc; \
 	.size name, .-name;

+#define ALIAS(alias,name) \
+	.global alias; \
+	.set alias, name;
+
 #define res0 x0
 #define res1 x1
 #define in0 x2
@@ -70,6 +89,24 @@ name: \
 #define SEQ_CST 5


+ENTRY (libat_load_16)
+	mov	x5, x0
+	cbnz	w1, 2f
+
+	/* RELAXED. */
+1:	ldxp	res0, res1, [x5]
+	stxp	w4, res0, res1, [x5]
+	cbnz	w4, 1b
+	ret
+
+	/* ACQUIRE/CONSUME/SEQ_CST. */
+2:	ldaxp	res0, res1, [x5]
+	stxp	w4, res0, res1, [x5]
+	cbnz	w4, 2b
+	ret
+END (libat_load_16)
+
+
 ENTRY (libat_load_16_i1)
 	cbnz	w1, 1f

@@ -93,6 +130,23 @@ ENTRY (libat_load_16_i1)
 END (libat_load_16_i1)


+ENTRY (libat_store_16)
+	cbnz	w4, 2f
+
+	/* RELAXED. */
+1:	ldxp	xzr, tmp0, [x0]
+	stxp	w4, in0, in1, [x0]
+	cbnz	w4, 1b
+	ret
+
+	/* RELEASE/SEQ_CST. */
+2:	ldxp	xzr, tmp0, [x0]
+	stlxp	w4, in0, in1, [x0]
+	cbnz	w4, 2b
+	ret
+END (libat_store_16)
+
+
 ENTRY (libat_store_16_i1)
 	cbnz	w4, 1f

@@ -101,14 +155,14 @@ ENTRY (libat_store_16_i1)
 	ret

 	/* RELEASE/SEQ_CST. */
-1:	ldaxp	xzr, tmp0, [x0]
+1:	ldxp	xzr, tmp0, [x0]
 	stlxp	w4, in0, in1, [x0]
 	cbnz	w4, 1b
 	ret
 END (libat_store_16_i1)


-ENTRY (libat_exchange_16_i1)
+ENTRY (libat_exchange_16)
 	mov	x5, x0
 	cbnz	w4, 2f

@@ -126,22 +180,55 @@ ENTRY (libat_exchange_16_i1)
 	stxp	w4, in0, in1, [x5]
 	cbnz	w4, 3b
 	ret
-4:
-	cmp	w4, RELEASE
-	b.ne	6f

-	/* RELEASE. */
-5:	ldxp	res0, res1, [x5]
+	/* RELEASE/ACQ_REL/SEQ_CST. */
+4:	ldaxp	res0, res1, [x5]
 	stlxp	w4, in0, in1, [x5]
-	cbnz	w4, 5b
+	cbnz	w4, 4b
 	ret
+END (libat_exchange_16)

-	/* ACQ_REL/SEQ_CST. */
-6:	ldaxp	res0, res1, [x5]
-	stlxp	w4, in0, in1, [x5]
-	cbnz	w4, 6b
+
+ENTRY (libat_compare_exchange_16)
+	ldp	exp0, exp1, [x1]
+	cbz	w4, 3f
+	cmp	w4, RELEASE
+	b.hs	4f
+
+	/* ACQUIRE/CONSUME. */
+1:	ldaxp	tmp0, tmp1, [x0]
+	cmp	tmp0, exp0
+	ccmp	tmp1, exp1, 0, eq
+	bne	2f
+	stxp	w4, in0, in1, [x0]
+	cbnz	w4, 1b
+	mov	x0, 1
 	ret
-END (libat_exchange_16_i1)
+
+2:	stp	tmp0, tmp1, [x1]
+	mov	x0, 0
+	ret
+
+	/* RELAXED. */
+3:	ldxp	tmp0, tmp1, [x0]
+	cmp	tmp0, exp0
+	ccmp	tmp1, exp1, 0, eq
+	bne	2b
+	stxp	w4, in0, in1, [x0]
+	cbnz	w4, 3b
+	mov	x0, 1
+	ret
+
+	/* RELEASE/ACQ_REL/SEQ_CST. */
+4:	ldaxp	tmp0, tmp1, [x0]
+	cmp	tmp0, exp0
+	ccmp	tmp1, exp1, 0, eq
+	bne	2b
+	stlxp	w4, in0, in1, [x0]
+	cbnz	w4, 4b
+	mov	x0, 1
+	ret
+END (libat_compare_exchange_16)


 ENTRY (libat_compare_exchange_16_i1)
@@ -180,7 +267,7 @@ ENTRY (libat_compare_exchange_16_i1)
 END (libat_compare_exchange_16_i1)


-ENTRY (libat_fetch_add_16_i1)
+ENTRY (libat_fetch_add_16)
 	mov	x5, x0
 	cbnz	w4, 2f

@@ -199,10 +286,10 @@ ENTRY (libat_fetch_add_16_i1)
 	stlxp	w4, tmp0, tmp1, [x5]
 	cbnz	w4, 2b
 	ret
-END (libat_fetch_add_16_i1)
+END (libat_fetch_add_16)


-ENTRY (libat_add_fetch_16_i1)
+ENTRY (libat_add_fetch_16)
 	mov	x5, x0
 	cbnz	w4, 2f

@@ -221,10 +308,10 @@ ENTRY (libat_add_fetch_16_i1)
 	stlxp	w4, res0, res1, [x5]
 	cbnz	w4, 2b
 	ret
-END (libat_add_fetch_16_i1)
+END (libat_add_fetch_16)


-ENTRY (libat_fetch_sub_16_i1)
+ENTRY (libat_fetch_sub_16)
 	mov	x5, x0
 	cbnz	w4, 2f

@@ -243,10 +330,10 @@ ENTRY (libat_fetch_sub_16_i1)
 	stlxp	w4, tmp0, tmp1, [x5]
 	cbnz	w4, 2b
 	ret
-END (libat_fetch_sub_16_i1)
+END (libat_fetch_sub_16)


-ENTRY (libat_sub_fetch_16_i1)
+ENTRY (libat_sub_fetch_16)
 	mov	x5, x0
 	cbnz	w4, 2f

@@ -265,10 +352,10 @@ ENTRY (libat_sub_fetch_16_i1)
 	stlxp	w4, res0, res1, [x5]
 	cbnz	w4, 2b
 	ret
-END (libat_sub_fetch_16_i1)
+END (libat_sub_fetch_16)


-ENTRY (libat_fetch_or_16_i1)
+ENTRY (libat_fetch_or_16)
 	mov	x5, x0
 	cbnz	w4, 2f

@@ -287,10 +374,10 @@ ENTRY (libat_fetch_or_16_i1)
 	stlxp	w4, tmp0, tmp1, [x5]
 	cbnz	w4, 2b
 	ret
-END (libat_fetch_or_16_i1)
+END (libat_fetch_or_16)


-ENTRY (libat_or_fetch_16_i1)
+ENTRY (libat_or_fetch_16)
 	mov	x5, x0
 	cbnz	w4, 2f

@@ -309,10 +396,10 @@ ENTRY (libat_or_fetch_16_i1)
 	stlxp	w4, res0, res1, [x5]
 	cbnz	w4, 2b
 	ret
-END (libat_or_fetch_16_i1)
+END (libat_or_fetch_16)


-ENTRY (libat_fetch_and_16_i1)
+ENTRY (libat_fetch_and_16)
 	mov	x5, x0
 	cbnz	w4, 2f

@@ -331,10 +418,10 @@ ENTRY (libat_fetch_and_16_i1)
 	stlxp	w4, tmp0, tmp1, [x5]
 	cbnz	w4, 2b
 	ret
-END (libat_fetch_and_16_i1)
+END (libat_fetch_and_16)


-ENTRY (libat_and_fetch_16_i1)
+ENTRY (libat_and_fetch_16)
 	mov	x5, x0
 	cbnz	w4, 2f

@@ -353,10 +440,10 @@ ENTRY (libat_and_fetch_16_i1)
 	stlxp	w4, res0, res1, [x5]
 	cbnz	w4, 2b
 	ret
-END (libat_and_fetch_16_i1)
+END (libat_and_fetch_16)


-ENTRY (libat_fetch_xor_16_i1)
+ENTRY (libat_fetch_xor_16)
 	mov	x5, x0
 	cbnz	w4, 2f

@@ -375,10 +462,10 @@ ENTRY (libat_fetch_xor_16_i1)
 	stlxp	w4, tmp0, tmp1, [x5]
 	cbnz	w4, 2b
 	ret
-END (libat_fetch_xor_16_i1)
+END (libat_fetch_xor_16)


-ENTRY (libat_xor_fetch_16_i1)
+ENTRY (libat_xor_fetch_16)
 	mov	x5, x0
 	cbnz	w4, 2f

@@ -397,10 +484,10 @@ ENTRY (libat_xor_fetch_16_i1)
 	stlxp	w4, res0, res1, [x5]
 	cbnz	w4, 2b
 	ret
-END (libat_xor_fetch_16_i1)
+END (libat_xor_fetch_16)


-ENTRY (libat_fetch_nand_16_i1)
+ENTRY (libat_fetch_nand_16)
 	mov	x5, x0
 	mvn	in0, in0
 	mvn	in1, in1
@@ -421,10 +508,10 @@ ENTRY (libat_fetch_nand_16_i1)
 	stlxp	w4, tmp0, tmp1, [x5]
 	cbnz	w4, 2b
 	ret
-END (libat_fetch_nand_16_i1)
+END (libat_fetch_nand_16)


-ENTRY (libat_nand_fetch_16_i1)
+ENTRY (libat_nand_fetch_16)
 	mov	x5, x0
 	mvn	in0, in0
 	mvn	in1, in1
@@ -445,21 +532,38 @@ ENTRY (libat_nand_fetch_16_i1)
 	stlxp	w4, res0, res1, [x5]
 	cbnz	w4, 2b
 	ret
-END (libat_nand_fetch_16_i1)
+END (libat_nand_fetch_16)


-ENTRY (libat_test_and_set_16_i1)
-	mov	w2, 1
-	cbnz	w1, 2f
-
-	/* RELAXED. */
-	swpb	w0, w2, [x0]
-	ret
+/* __atomic_test_and_set is always inlined, so this entry is unused and
+   only required for completeness. */
+ENTRY (libat_test_and_set_16)

-	/* ACQUIRE/CONSUME/RELEASE/ACQ_REL/SEQ_CST. */
-2:	swpalb	w0, w2, [x0]
+	/* RELAXED/ACQUIRE/CONSUME/RELEASE/ACQ_REL/SEQ_CST. */
+	mov	x5, x0
+1:	ldaxrb	w0, [x5]
+	stlxrb	w4, w2, [x5]
+	cbnz	w4, 1b
 	ret
-END (libat_test_and_set_16_i1)
+END (libat_test_and_set_16)
+
+
+/* Alias entry points which are the same in baseline and LSE2. */
+
+ALIAS (libat_exchange_16_i1, libat_exchange_16)
+ALIAS (libat_fetch_add_16_i1, libat_fetch_add_16)
+ALIAS (libat_add_fetch_16_i1, libat_add_fetch_16)
+ALIAS (libat_fetch_sub_16_i1, libat_fetch_sub_16)
+ALIAS (libat_sub_fetch_16_i1, libat_sub_fetch_16)
+ALIAS (libat_fetch_or_16_i1, libat_fetch_or_16)
+ALIAS (libat_or_fetch_16_i1, libat_or_fetch_16)
+ALIAS (libat_fetch_and_16_i1, libat_fetch_and_16)
+ALIAS (libat_and_fetch_16_i1, libat_and_fetch_16)
+ALIAS (libat_fetch_xor_16_i1, libat_fetch_xor_16)
+ALIAS (libat_xor_fetch_16_i1, libat_xor_fetch_16)
+ALIAS (libat_fetch_nand_16_i1, libat_fetch_nand_16)
+ALIAS (libat_nand_fetch_16_i1, libat_nand_fetch_16)
+ALIAS (libat_test_and_set_16_i1, libat_test_and_set_16)


 /* GNU_PROPERTY_AARCH64_* macros from elf.h for use in asm code. */
diff --git a/libatomic/config/linux/aarch64/host-config.h b/libatomic/config/linux/aarch64/host-config.h
index bea26825b4f75bb8ff348ab4b5fc45f4a5bd561e..851c78c01cd643318aaa52929ce4550266238b79 100644
--- a/libatomic/config/linux/aarch64/host-config.h
+++ b/libatomic/config/linux/aarch64/host-config.h
@@ -35,10 +35,19 @@
 #endif
 #define IFUNC_NCOND(N) (1)

-#if N == 16 && IFUNC_ALT != 0
+#endif /* HAVE_IFUNC */
+
+/* All 128-bit atomic functions are defined in aarch64/atomic_16.S. */
+#if N == 16
 # define DONE 1
 #endif

-#endif /* HAVE_IFUNC */
+/* State we have lock-free 128-bit atomics. */
+#undef FAST_ATOMIC_LDST_16
+#define FAST_ATOMIC_LDST_16 1
+#undef MAYBE_HAVE_ATOMIC_CAS_16
+#define MAYBE_HAVE_ATOMIC_CAS_16 1
+#undef MAYBE_HAVE_ATOMIC_EXCHANGE_16
+#define MAYBE_HAVE_ATOMIC_EXCHANGE_16 1

 #include_next