From: Kyrylo Tkachov
To: Richard Sandiford
CC: "gcc-patches@gcc.gnu.org"
Subject: RE: [PATCH 2/6] aarch64: Allow moves after tied-register intrinsics
Date: Mon, 15 May 2023 14:22:33 +0000
References: <20230509064831.1651327-1-richard.sandiford@arm.com> <20230509064831.1651327-3-richard.sandiford@arm.com>

> -----Original Message-----
> From: Richard Sandiford
> Sent: Monday, May 15, 2023 3:18 PM
> To: Kyrylo Tkachov
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH 2/6] aarch64: Allow moves after tied-register intrinsics
>
> Kyrylo Tkachov writes:
> > Hi Richard,
> >
> >> -----Original Message-----
> >> From: Gcc-patches
> >> bounces+kyrylo.tkachov=arm.com@gcc.gnu.org> On Behalf Of Richard
> >> Sandiford via Gcc-patches
> >> Sent: Tuesday, May 9, 2023 7:48 AM
> >> To: gcc-patches@gcc.gnu.org
> >> Cc: Richard Sandiford
> >> Subject: [PATCH 2/6] aarch64: Allow moves after tied-register intrinsics
> >>
> >> Some ACLE intrinsics map to instructions that tie the output
> >> operand to an input operand. If all the operands are allocated
> >> to different registers, and if MOVPRFX can't be used, we will need
> >> a move either before the instruction or after it. Many tests only
> >> matched the "before" case; this patch makes them accept the "after"
> >> case too.
> >>
> >> gcc/testsuite/
> >> 	* gcc.target/aarch64/advsimd-intrinsics/bfcvtnq2-untied.c: Allow
> >> 	moves to occur after the intrinsic instruction, rather than requiring
> >> 	them to happen before.
> >> 	* gcc.target/aarch64/advsimd-intrinsics/bfdot-1.c: Likewise.
> >> 	* gcc.target/aarch64/advsimd-intrinsics/vdot-3-1.c: Likewise.
> >
> > I'm seeing some dot-product intrinsics failures:
> > FAIL: gcc.target/aarch64/advsimd-intrinsics/bfdot-2.c -O1 check-function-bodies ufoo_untied
> > FAIL: gcc.target/aarch64/advsimd-intrinsics/bfdot-2.c -O1 check-function-bodies ufooq_lane_untied
> > FAIL: gcc.target/aarch64/advsimd-intrinsics/bfdot-2.c -O2 check-function-bodies ufoo_untied
> > FAIL: gcc.target/aarch64/advsimd-intrinsics/bfdot-2.c -O2 check-function-bodies ufooq_lane_untied
> > FAIL: gcc.target/aarch64/advsimd-intrinsics/bfdot-2.c -O2 -flto -fno-use-linker-plugin -flto-partition=none check-function-bodies ufoo_untied
> > FAIL: gcc.target/aarch64/advsimd-intrinsics/bfdot-2.c -O2 -flto -fno-use-linker-plugin -flto-partition=none check-function-bodies ufooq_lane_untied
> > FAIL: gcc.target/aarch64/advsimd-intrinsics/bfdot-2.c -O3 -g check-function-bodies ufoo_untied
> > FAIL: gcc.target/aarch64/advsimd-intrinsics/bfdot-2.c -O3 -g check-function-bodies ufooq_lane_untied
> > FAIL: gcc.target/aarch64/advsimd-intrinsics/bfdot-2.c -Og -g check-function-bodies ufoo_untied
> > FAIL: gcc.target/aarch64/advsimd-intrinsics/bfdot-2.c -Og -g check-function-bodies ufooq_lane_untied
> > FAIL: gcc.target/aarch64/advsimd-intrinsics/bfdot-2.c -Os check-function-bodies ufoo_untied
> > FAIL: gcc.target/aarch64/advsimd-intrinsics/bfdot-2.c -Os check-function-bodies ufooq_lane_untied
> > FAIL: gcc.target/aarch64/advsimd-intrinsics/vdot-3-2.c -O1 check-function-bodies ufoo_untied
> > FAIL: gcc.target/aarch64/advsimd-intrinsics/vdot-3-2.c -O1 check-function-bodies ufooq_laneq_untied
> > FAIL: gcc.target/aarch64/advsimd-intrinsics/vdot-3-2.c -O2 check-function-bodies ufoo_untied
> > FAIL: gcc.target/aarch64/advsimd-intrinsics/vdot-3-2.c -O2 check-function-bodies ufooq_laneq_untied
> > FAIL: gcc.target/aarch64/advsimd-intrinsics/vdot-3-2.c -O2 -flto -fno-use-linker-plugin -flto-partition=none check-function-bodies ufoo_untied
> > FAIL: gcc.target/aarch64/advsimd-intrinsics/vdot-3-2.c -O2 -flto -fno-use-linker-plugin -flto-partition=none check-function-bodies ufooq_laneq_untied
> > FAIL: gcc.target/aarch64/advsimd-intrinsics/vdot-3-2.c -O3 -g check-function-bodies ufoo_untied
> > FAIL: gcc.target/aarch64/advsimd-intrinsics/vdot-3-2.c -O3 -g check-function-bodies ufooq_laneq_untied
> > FAIL: gcc.target/aarch64/advsimd-intrinsics/vdot-3-2.c -Og -g check-function-bodies ufoo_untied
> > FAIL: gcc.target/aarch64/advsimd-intrinsics/vdot-3-2.c -Og -g check-function-bodies ufooq_laneq_untied
> > FAIL: gcc.target/aarch64/advsimd-intrinsics/vdot-3-2.c -Os check-function-bodies ufoo_untied
> > FAIL: gcc.target/aarch64/advsimd-intrinsics/vdot-3-2.c -Os check-function-bodies ufooq_laneq_untied
>
> Ugh. Big-endian. Hadn't thought about that being an issue.
> Was testing natively on little-endian aarch64-linux-gnu and
> didn't see these.

FWIW this is on a little-endian aarch64-none-elf configuration.
Maybe some defaults are different on bare-metal from Linux...

>
> > From a quick inspection it looks like it's just an alternative regalloc that
> > moves the mov + dot instructions around, similar to what you fixed in
> > bfdot-2.c and vdot-3-2.c.
> > I guess they need a similar adjustment?
>
> Yeah, will fix.

Thanks!
Kyrill

>
> Thanks,
> Richard