From mboxrd@z Thu Jan 1 00:00:00 1970
From: Victor Do Nascimento
CC: Victor Do Nascimento
Subject: [PATCH 2/2] libatomic: Enable LSE128 128-bit atomics for armv9.4-a
Date: Tue, 7 Nov 2023 10:23:56 +0000
Message-ID: <20231107102424.2836255-3-victor.donascimento@arm.com>
X-Mailer: git-send-email 2.41.0
In-Reply-To: <20231107102424.2836255-1-victor.donascimento@arm.com>
References: <20231107102424.2836255-1-victor.donascimento@arm.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Content-Type: text/plain

The armv9.4-a architectural revision adds three new atomic operations
associated with the LSE128 feature:

  * LDCLRP - Atomic AND NOT (bitclear) of a location with a 128-bit
  value held in a pair of registers, with the original data loaded
  into the same 2 registers.
  * LDSETP - Atomic OR (bitset) of a location with a 128-bit value
  held in a pair of registers, with the original data loaded into the
  same 2 registers.
  * SWPP - Atomic swap of one 128-bit value with a 128-bit value held
  in a pair of registers.

This patch adds the logic required to make use of these when the
architectural feature is present and a suitable assembler is
available.

In order to do this, the following changes are made:

  1. Add a configure-time check for LSE128 support in the assembler.
  2. Edit host-config.h so that when N == 16, nifunc = 2.
  3. Where available due to LSE128, implement the second ifunc, making
  use of the new instructions.
  4. For atomic functions unable to make use of these new
  instructions, define a new alias which causes the _i1 function
  variant to point ahead to the corresponding _i2 implementation.

libatomic/ChangeLog:

	* Makefile.am (AM_CPPFLAGS): Add conditional setting of
	-DHAVE_FEAT_LSE128.
	* acinclude.m4 (LIBAT_TEST_FEAT_LSE128): New.
	* config/linux/aarch64/atomic_16.S (LSE128): New macro
	definition.
	(libat_exchange_16): New LSE128 variant.
	(libat_fetch_or_16): Likewise.
	(libat_or_fetch_16): Likewise.
	(libat_fetch_and_16): Likewise.
	(libat_and_fetch_16): Likewise.
	* config/linux/aarch64/host-config.h (IFUNC_COND_2): New.
	(IFUNC_NCOND): Add operand size checking.
	(has_lse2): Renamed from `ifunc1`.
	(has_lse128): New.
	(HAS_LSE128): Likewise.
	* libatomic/configure.ac: Add call to LIBAT_TEST_FEAT_LSE128.
	* configure (ac_subst_vars): Regenerated via autoreconf.
	* libatomic/Makefile.in: Likewise.
	* libatomic/auto-config.h.in: Likewise.
---
 libatomic/Makefile.am                        |   3 +
 libatomic/Makefile.in                        |   1 +
 libatomic/acinclude.m4                       |  19 +++
 libatomic/auto-config.h.in                   |   3 +
 libatomic/config/linux/aarch64/atomic_16.S   | 170 ++++++++++++++++++-
 libatomic/config/linux/aarch64/host-config.h |  23 ++-
 libatomic/configure                          |  59 ++++++-
 libatomic/configure.ac                       |   1 +
 8 files changed, 271 insertions(+), 8 deletions(-)

diff --git a/libatomic/Makefile.am b/libatomic/Makefile.am
index c0b8dea5037..24e843db67d 100644
--- a/libatomic/Makefile.am
+++ b/libatomic/Makefile.am
@@ -130,6 +130,9 @@ libatomic_la_LIBADD = $(foreach s,$(SIZES),$(addsuffix _$(s)_.lo,$(SIZEOBJS)))
 ## On a target-specific basis, include alternates to be selected by IFUNC.
 if HAVE_IFUNC
 if ARCH_AARCH64_LINUX
+if ARCH_AARCH64_HAVE_LSE128
+AM_CPPFLAGS = -DHAVE_FEAT_LSE128
+endif
 IFUNC_OPTIONS = -march=armv8-a+lse
 libatomic_la_LIBADD += $(foreach s,$(SIZES),$(addsuffix _$(s)_1_.lo,$(SIZEOBJS)))
 libatomic_la_SOURCES += atomic_16.S
diff --git a/libatomic/Makefile.in b/libatomic/Makefile.in
index dc2330b91fd..cd48fa21334 100644
--- a/libatomic/Makefile.in
+++ b/libatomic/Makefile.in
@@ -452,6 +452,7 @@ M_SRC = $(firstword $(filter %/$(M_FILE), $(all_c_files)))
 libatomic_la_LIBADD = $(foreach s,$(SIZES),$(addsuffix \
 	_$(s)_.lo,$(SIZEOBJS))) $(am__append_1) $(am__append_3) \
 	$(am__append_4) $(am__append_5)
+@ARCH_AARCH64_HAVE_LSE128_TRUE@@ARCH_AARCH64_LINUX_TRUE@@HAVE_IFUNC_TRUE@AM_CPPFLAGS = -DHAVE_FEAT_LSE128
 @ARCH_AARCH64_LINUX_TRUE@@HAVE_IFUNC_TRUE@IFUNC_OPTIONS = -march=armv8-a+lse
 @ARCH_ARM_LINUX_TRUE@@HAVE_IFUNC_TRUE@IFUNC_OPTIONS = -march=armv7-a+fp -DHAVE_KERNEL64
 @ARCH_I386_TRUE@@HAVE_IFUNC_TRUE@IFUNC_OPTIONS = -march=i586
diff --git a/libatomic/acinclude.m4 b/libatomic/acinclude.m4
index f35ab5b60a5..4197db8f404 100644
--- a/libatomic/acinclude.m4
+++ b/libatomic/acinclude.m4
@@ -83,6 +83,25 @@ AC_DEFUN([LIBAT_TEST_ATOMIC_BUILTIN],[
   ])
 ])
 
+dnl
+dnl Test if the host assembler supports armv9.4-a LSE128 insns.
+dnl
+AC_DEFUN([LIBAT_TEST_FEAT_LSE128],[
+  AC_CACHE_CHECK([for armv9.4-a LSE128 insn support],
+    [libat_cv_have_feat_lse128],[
+    AC_LANG_CONFTEST([AC_LANG_PROGRAM([],[asm(".arch armv9-a+lse128")])])
+    if AC_TRY_EVAL(ac_link); then
+      eval libat_cv_have_feat_lse128=yes
+    else
+      eval libat_cv_have_feat_lse128=no
+    fi
+    rm -f conftest*
+  ])
+  LIBAT_DEFINE_YESNO([HAVE_FEAT_LSE128], [$libat_cv_have_feat_lse128],
+	[Have LSE128 support for 16 byte integers.])
+  AM_CONDITIONAL([ARCH_AARCH64_HAVE_LSE128], [test x$libat_cv_have_feat_lse128 = xyes])
+])
+
 dnl
 dnl Test if we have __atomic_load and __atomic_store for mode $1, size $2
 dnl
diff --git a/libatomic/auto-config.h.in b/libatomic/auto-config.h.in
index ab3424a759e..7c78933b07d 100644
--- a/libatomic/auto-config.h.in
+++ b/libatomic/auto-config.h.in
@@ -105,6 +105,9 @@
 /* Define to 1 if you have the <dlfcn.h> header file. */
 #undef HAVE_DLFCN_H
 
+/* Have LSE128 support for 16 byte integers. */
+#undef HAVE_FEAT_LSE128
+
 /* Define to 1 if you have the <fenv.h> header file. */
 #undef HAVE_FENV_H
 
diff --git a/libatomic/config/linux/aarch64/atomic_16.S b/libatomic/config/linux/aarch64/atomic_16.S
index 3f6225830e6..44a773031f8 100644
--- a/libatomic/config/linux/aarch64/atomic_16.S
+++ b/libatomic/config/linux/aarch64/atomic_16.S
@@ -34,10 +34,14 @@
    writes, this will be true when using atomics in actual code.
 
    The libat_<op>_16 entry points are ARMv8.0.
-   The libat_<op>_16_i1 entry points are used when LSE2 is available.  */
-
+   The libat_<op>_16_i1 entry points are used when LSE128 is available.
+   The libat_<op>_16_i2 entry points are used when LSE2 is available.  */
 
+#if HAVE_FEAT_LSE128
+	.arch armv8-a+lse128
+#else
 	.arch armv8-a+lse
+#endif
 
 #define ENTRY(name, feat)		\
 	ENTRY1(name, feat)
@@ -66,7 +70,8 @@ name##feat:			\
 	.set alias##from, alias##to;
 
 #define CORE
-#define LSE2	_i1
+#define LSE128	_i1
+#define LSE2	_i2
 
 #define res0 x0
 #define res1 x1
@@ -201,6 +206,31 @@ ENTRY (libat_exchange_16, CORE)
 END (libat_exchange_16, CORE)
 
 
+#if HAVE_FEAT_LSE128
+ENTRY (libat_exchange_16, LSE128)
+	mov	tmp0, x0
+	mov	res0, in0
+	mov	res1, in1
+	cbnz	w4, 1f
+
+	/* RELAXED.  */
+	swpp	res0, res1, [tmp0]
+	ret
+1:
+	cmp	w4, ACQUIRE
+	b.hi	2f
+
+	/* ACQUIRE/CONSUME.  */
+	swppa	res0, res1, [tmp0]
+	ret
+
+	/* RELEASE/ACQ_REL/SEQ_CST.  */
+2:	swppal	res0, res1, [tmp0]
+	ret
+END (libat_exchange_16, LSE128)
+#endif
+
+
 ENTRY (libat_compare_exchange_16, CORE)
 	ldp	exp0, exp1, [x1]
 	cbz	w4, 3f
@@ -389,6 +419,31 @@ ENTRY (libat_fetch_or_16, CORE)
 END (libat_fetch_or_16, CORE)
 
 
+#if HAVE_FEAT_LSE128
+ENTRY (libat_fetch_or_16, LSE128)
+	mov	tmp0, x0
+	mov	res0, in0
+	mov	res1, in1
+	cbnz	w4, 1f
+
+	/* RELAXED.  */
+	ldsetp	res0, res1, [tmp0]
+	ret
+1:
+	cmp	w4, ACQUIRE
+	b.hi	2f
+
+	/* ACQUIRE/CONSUME.  */
+	ldsetpa	res0, res1, [tmp0]
+	ret
+
+	/* RELEASE/ACQ_REL/SEQ_CST.  */
+2:	ldsetpal	res0, res1, [tmp0]
+	ret
+END (libat_fetch_or_16, LSE128)
+#endif
+
+
 ENTRY (libat_or_fetch_16, CORE)
 	mov	x5, x0
 	cbnz	w4, 2f
@@ -411,6 +466,36 @@ ENTRY (libat_or_fetch_16, CORE)
 END (libat_or_fetch_16, CORE)
 
 
+#if HAVE_FEAT_LSE128
+ENTRY (libat_or_fetch_16, LSE128)
+	mov	tmp0, in0
+	mov	tmp1, in1
+	cbnz	w4, 1f
+
+	/* RELAXED.  */
+	ldsetp	in0, in1, [x0]
+	orr	res0, in0, tmp0
+	orr	res1, in1, tmp1
+	ret
+1:
+	cmp	w4, ACQUIRE
+	b.hi	2f
+
+	/* ACQUIRE/CONSUME.  */
+	ldsetpa	in0, in1, [x0]
+	orr	res0, in0, tmp0
+	orr	res1, in1, tmp1
+	ret
+
+	/* RELEASE/ACQ_REL/SEQ_CST.  */
+2:	ldsetpal	in0, in1, [x0]
+	orr	res0, in0, tmp0
+	orr	res1, in1, tmp1
+	ret
+END (libat_or_fetch_16, LSE128)
+#endif
+
+
 ENTRY (libat_fetch_and_16, CORE)
 	mov	x5, x0
 	cbnz	w4, 2f
@@ -433,6 +518,32 @@ ENTRY (libat_fetch_and_16, CORE)
 END (libat_fetch_and_16, CORE)
 
 
+#if HAVE_FEAT_LSE128
+ENTRY (libat_fetch_and_16, LSE128)
+	mov	tmp0, x0
+	mvn	res0, in0
+	mvn	res1, in1
+	cbnz	w4, 1f
+
+	/* RELAXED.  */
+	ldclrp	res0, res1, [tmp0]
+	ret
+
+1:
+	cmp	w4, ACQUIRE
+	b.hi	2f
+
+	/* ACQUIRE/CONSUME.  */
+	ldclrpa	res0, res1, [tmp0]
+	ret
+
+	/* RELEASE/ACQ_REL/SEQ_CST.  */
+2:	ldclrpal	res0, res1, [tmp0]
+	ret
+END (libat_fetch_and_16, LSE128)
+#endif
+
+
 ENTRY (libat_and_fetch_16, CORE)
 	mov	x5, x0
 	cbnz	w4, 2f
@@ -455,6 +566,37 @@ ENTRY (libat_and_fetch_16, CORE)
 END (libat_and_fetch_16, CORE)
 
 
+#if HAVE_FEAT_LSE128
+ENTRY (libat_and_fetch_16, LSE128)
+	mvn	tmp0, in0
+	mvn	tmp1, in1
+	cbnz	w4, 1f
+
+	/* RELAXED.  */
+	ldclrp	tmp0, tmp1, [x0]
+	and	res0, tmp0, in0
+	and	res1, tmp1, in1
+	ret
+
+1:
+	cmp	w4, ACQUIRE
+	b.hi	2f
+
+	/* ACQUIRE/CONSUME.  */
+	ldclrpa	tmp0, tmp1, [x0]
+	and	res0, tmp0, in0
+	and	res1, tmp1, in1
+	ret
+
+	/* RELEASE/ACQ_REL/SEQ_CST.  */
+2:	ldclrpal	tmp0, tmp1, [x0]
+	and	res0, tmp0, in0
+	and	res1, tmp1, in1
+	ret
+END (libat_and_fetch_16, LSE128)
+#endif
+
+
 ENTRY (libat_fetch_xor_16, CORE)
 	mov	x5, x0
 	cbnz	w4, 2f
@@ -560,6 +702,28 @@ ENTRY (libat_test_and_set_16, CORE)
 END (libat_test_and_set_16, CORE)
 
 
+/* Alias entry points which are the same in LSE2 and LSE128.  */
+
+#if !HAVE_FEAT_LSE128
+ALIAS (libat_exchange_16, LSE128, LSE2)
+ALIAS (libat_fetch_or_16, LSE128, LSE2)
+ALIAS (libat_fetch_and_16, LSE128, LSE2)
+ALIAS (libat_or_fetch_16, LSE128, LSE2)
+ALIAS (libat_and_fetch_16, LSE128, LSE2)
+#endif
+ALIAS (libat_load_16, LSE128, LSE2)
+ALIAS (libat_store_16, LSE128, LSE2)
+ALIAS (libat_compare_exchange_16, LSE128, LSE2)
+ALIAS (libat_fetch_add_16, LSE128, LSE2)
+ALIAS (libat_add_fetch_16, LSE128, LSE2)
+ALIAS (libat_fetch_sub_16, LSE128, LSE2)
+ALIAS (libat_sub_fetch_16, LSE128, LSE2)
+ALIAS (libat_fetch_xor_16, LSE128, LSE2)
+ALIAS (libat_xor_fetch_16, LSE128, LSE2)
+ALIAS (libat_fetch_nand_16, LSE128, LSE2)
+ALIAS (libat_nand_fetch_16, LSE128, LSE2)
+ALIAS (libat_test_and_set_16, LSE128, LSE2)
+
 /* Alias entry points which are the same in baseline and LSE2.  */
 
 ALIAS (libat_exchange_16, LSE2, CORE)
diff --git a/libatomic/config/linux/aarch64/host-config.h b/libatomic/config/linux/aarch64/host-config.h
index 30ef21c7715..6e471f9e400 100644
--- a/libatomic/config/linux/aarch64/host-config.h
+++ b/libatomic/config/linux/aarch64/host-config.h
@@ -26,14 +26,15 @@
 
 #ifdef HWCAP_USCAT
 # if N == 16
-# define IFUNC_COND_1	(ifunc1 (hwcap))
+# define IFUNC_COND_1	(has_lse128 (hwcap))
+# define IFUNC_COND_2	(has_lse2 (hwcap))
 # else
 # define IFUNC_COND_1	(hwcap & HWCAP_ATOMICS)
 # endif
 #else
 # define IFUNC_COND_1	(false)
 #endif
-#define IFUNC_NCOND(N)	(1)
+#define IFUNC_NCOND(N)	(N == 16 ? 2 : 1)
 
 #endif /* HAVE_IFUNC */
 
@@ -56,7 +57,7 @@
 #define MIDR_PARTNUM(midr)	(((midr) >> 4) & 0xfff)
 
 static inline bool
-ifunc1 (unsigned long hwcap)
+has_lse2 (unsigned long hwcap)
 {
   if (hwcap & HWCAP_USCAT)
     return true;
@@ -69,6 +70,22 @@ ifunc1 (unsigned long hwcap)
     return true;
   return false;
 }
+
+/* LSE128 atomic support encoded in ID_AA64ISAR0_EL1.Atomic,
+   bits [23:20].  The expected value is 0b0011.  Check that.  */
+#define HAS_LSE128() ({				\
+  unsigned long __val;				\
+  asm("mrs %0, ID_AA64ISAR0_EL1" : "=r" (__val));	\
+  (__val & 0x300000) == 0x300000;		\
+  })
+
+static inline bool
+has_lse128 (unsigned long hwcap)
+{
+  if (has_lse2 (hwcap) && HAS_LSE128 ())
+    return true;
+  return false;
+}
 #endif
 
 #include_next <host-config.h>
diff --git a/libatomic/configure b/libatomic/configure
index d579bab96f8..ee3bbb97d69 100755
--- a/libatomic/configure
+++ b/libatomic/configure
@@ -657,6 +657,8 @@ LIBAT_BUILD_VERSIONED_SHLIB_TRUE
 OPT_LDFLAGS
 SECTION_LDFLAGS
 SYSROOT_CFLAGS_FOR_TARGET
+ARCH_AARCH64_HAVE_LSE128_FALSE
+ARCH_AARCH64_HAVE_LSE128_TRUE
 enable_aarch64_lse
 libtool_VERSION
 ENABLE_DARWIN_AT_RPATH_FALSE
@@ -11456,7 +11458,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 11459 "configure"
+#line 11461 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
@@ -11562,7 +11564,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 11565 "configure"
+#line 11567 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
@@ -11926,6 +11928,55 @@ ac_link='$CC -o conftest$ac_exeext $CFLAGS $CPPFLAGS $LDFLAGS conftest.$ac_ext $
 ac_compiler_gnu=$ac_cv_c_compiler_gnu
 
 
+
+  { $as_echo "$as_me:${as_lineno-$LINENO}: checking for armv9.4-a LSE128 insn support" >&5
+$as_echo_n "checking for armv9.4-a LSE128 insn support... " >&6; }
+if ${libat_cv_have_feat_lse128+:} false; then :
+  $as_echo_n "(cached) " >&6
+else
+
+    cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h.  */
+
+int
+main ()
+{
+asm(".arch armv9-a+lse128")
+  ;
+  return 0;
+}
+_ACEOF
+    if { { eval echo "\"\$as_me\":${as_lineno-$LINENO}: \"$ac_link\""; } >&5
+  (eval $ac_link) 2>&5
+  ac_status=$?
+  $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5
+  test $ac_status = 0; }; then
+      eval libat_cv_have_feat_lse128=yes
+    else
+      eval libat_cv_have_feat_lse128=no
+    fi
+    rm -f conftest*
+
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $libat_cv_have_feat_lse128" >&5
+$as_echo "$libat_cv_have_feat_lse128" >&6; }
+
+  yesno=`echo $libat_cv_have_feat_lse128 | tr 'yesno' '1  0 '`
+
+cat >>confdefs.h <<_ACEOF
+#define HAVE_FEAT_LSE128 $yesno
+_ACEOF
+
+
+   if test x$libat_cv_have_feat_lse128 = xyes; then
+  ARCH_AARCH64_HAVE_LSE128_TRUE=
+  ARCH_AARCH64_HAVE_LSE128_FALSE='#'
+else
+  ARCH_AARCH64_HAVE_LSE128_TRUE='#'
+  ARCH_AARCH64_HAVE_LSE128_FALSE=
+fi
+
+
     ;;
 esac
 
@@ -15989,6 +16040,10 @@ if test -z "${ENABLE_DARWIN_AT_RPATH_TRUE}" && test -z "${ENABLE_DARWIN_AT_RPATH
   as_fn_error $? "conditional \"ENABLE_DARWIN_AT_RPATH\" was never defined.
 Usually this means the macro was only invoked conditionally." "$LINENO" 5
 fi
+if test -z "${ARCH_AARCH64_HAVE_LSE128_TRUE}" && test -z "${ARCH_AARCH64_HAVE_LSE128_FALSE}"; then
+  as_fn_error $? "conditional \"ARCH_AARCH64_HAVE_LSE128\" was never defined.
+Usually this means the macro was only invoked conditionally." "$LINENO" 5
+fi
 if test -z "${LIBAT_BUILD_VERSIONED_SHLIB_TRUE}" && test -z "${LIBAT_BUILD_VERSIONED_SHLIB_FALSE}"; then
   as_fn_error $? "conditional \"LIBAT_BUILD_VERSIONED_SHLIB\" was never defined.
 Usually this means the macro was only invoked conditionally." "$LINENO" 5
diff --git a/libatomic/configure.ac b/libatomic/configure.ac
index 5f2821ac3f4..b2fe68d7d0f 100644
--- a/libatomic/configure.ac
+++ b/libatomic/configure.ac
@@ -169,6 +169,7 @@ AC_MSG_RESULT([$target_thread_file])
 case "$target" in
   *aarch64*)
     ACX_PROG_CC_WARNING_OPTS([-march=armv8-a+lse],[enable_aarch64_lse])
+    LIBAT_TEST_FEAT_LSE128()
     ;;
 esac
 
-- 
2.41.0
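
For readers checking the register-level logic of the LSE128 variants, the
following C sketch models what the fetch_and, and_fetch, fetch_or and
or_fetch entry points compute.  It is an illustration only and not part
of the patch: the type u128 and all model_* functions are hypothetical
names, the model is single-threaded (it says nothing about atomicity or
memory ordering), and it assumes LDCLRP clears the bits set in its
operand while LDSETP sets them, both returning the previous memory
contents, as described in the commit message.

```c
#include <stdint.h>

/* Hypothetical stand-in for a 128-bit value held in a register pair.  */
typedef struct { uint64_t lo, hi; } u128;

/* Model of LDCLRP: *mem &= ~operand; the old value is returned.  */
static u128 model_ldclrp(u128 *mem, u128 operand)
{
    u128 old = *mem;
    mem->lo &= ~operand.lo;
    mem->hi &= ~operand.hi;
    return old;
}

/* Model of LDSETP: *mem |= operand; the old value is returned.  */
static u128 model_ldsetp(u128 *mem, u128 operand)
{
    u128 old = *mem;
    mem->lo |= operand.lo;
    mem->hi |= operand.hi;
    return old;
}

/* fetch_and: clearing the bits of ~v is the same as ANDing with v,
   which is why the assembly inverts the operand with MVN first.  */
static u128 model_fetch_and(u128 *mem, u128 v)
{
    u128 inverted = { ~v.lo, ~v.hi };
    return model_ldclrp(mem, inverted);
}

/* and_fetch: the instruction only yields the old value, so the new
   value is recomputed as old & v.  */
static u128 model_and_fetch(u128 *mem, u128 v)
{
    u128 old = model_fetch_and(mem, v);
    u128 result = { old.lo & v.lo, old.hi & v.hi };
    return result;
}

/* fetch_or maps directly onto LDSETP.  */
static u128 model_fetch_or(u128 *mem, u128 v)
{
    return model_ldsetp(mem, v);
}

/* or_fetch: recompute the new value as old | v, which needs a saved
   copy of v because the instruction overwrites its operand registers.  */
static u128 model_or_fetch(u128 *mem, u128 v)
{
    u128 old = model_ldsetp(mem, v);
    u128 result = { old.lo | v.lo, old.hi | v.hi };
    return result;
}
```

The *_fetch variants are the reason the operand has to be copied (or
inverted into temporaries) before the memory-order dispatch: every path
needs both the original operand and the value loaded from memory.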