From: Kyrylo Tkachov
To: Philipp Tomsich, "gcc-patches@gcc.gnu.org"
CC: Di Zhao
Subject: RE: [PATCH] aarch64: disable LDP via tuning structure for -mcpu=ampere1
Date: Fri, 14 Apr 2023 09:20:49 +0000
References: <20230413232157.1487389-1-philipp.tomsich@vrull.eu>
In-Reply-To: <20230413232157.1487389-1-philipp.tomsich@vrull.eu>

Hi Philipp,

> -----Original Message-----
> From: Philipp Tomsich
> Sent: Friday, April 14, 2023 12:22 AM
> To: gcc-patches@gcc.gnu.org
> Cc: Kyrylo Tkachov; Philipp Tomsich; Di Zhao
> Subject: [PATCH] aarch64: disable LDP via tuning structure for -mcpu=ampere1
>
> AmpereOne (-mcpu=ampere1) breaks LDP instructions into two uops.
> Given the chance that this causes instructions to slip into the next
> decoding cycle and the additional overheads when handling
> cacheline-crossing LDP instructions, we disable the generation of LDP
> instructions through the tuning structure from instruction combining
> (such as in peephole2).
>
> Given the code-density benefits in builtins and prologue/epilogue
> expansion, we allow LDPs there.

LDPs are indeed quite an important part of the ISA for code density, and there are, in principle, second-order benefits from using them, like keeping the instruction cache footprint low (which can be important for large workloads).
Did you gather some benchmarks showing a benefit of disabling them in this manner?

>
> This commit:
>  * adds a new tuning option AARCH64_EXTRA_TUNE_NO_LDP_COMBINE
>  * allows -moverride=tune=... to override this
>
> Signed-off-by: Philipp Tomsich
> Co-Authored-By: Di Zhao
>
> gcc/ChangeLog:
>
>   * config/aarch64/aarch64-tuning-flags.def
>   (AARCH64_EXTRA_TUNING_OPTION):
>   Add AARCH64_EXTRA_TUNE_NO_LDP_COMBINE.
>   * config/aarch64/aarch64.cc (aarch64_operands_ok_for_ldpstp):
>   Check for the above tuning option when processing loads.
>
> ---
>
>  gcc/config/aarch64/aarch64-tuning-flags.def | 3 +++
>  gcc/config/aarch64/aarch64.cc               | 8 +++++++-
>  2 files changed, 10 insertions(+), 1 deletion(-)
>
> diff --git a/gcc/config/aarch64/aarch64-tuning-flags.def b/gcc/config/aarch64/aarch64-tuning-flags.def
> index 712895a5263..52112ba7c48 100644
> --- a/gcc/config/aarch64/aarch64-tuning-flags.def
> +++ b/gcc/config/aarch64/aarch64-tuning-flags.def
> @@ -44,6 +44,9 @@ AARCH64_EXTRA_TUNING_OPTION ("cheap_shift_extend", CHEAP_SHIFT_EXTEND)
>  /* Disallow load/store pair instructions on Q-registers. */
>  AARCH64_EXTRA_TUNING_OPTION ("no_ldp_stp_qregs", NO_LDP_STP_QREGS)
>
> +/* Disallow load-pair instructions to be formed in combine/peephole. */
> +AARCH64_EXTRA_TUNING_OPTION ("no_ldp_combine", NO_LDP_COMBINE)
> +
>  AARCH64_EXTRA_TUNING_OPTION ("rename_load_regs", RENAME_LOAD_REGS)
>
>  AARCH64_EXTRA_TUNING_OPTION ("cse_sve_vl_constants", CSE_SVE_VL_CONSTANTS)
> diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
> index f4ef22ce02f..8dc1a9ceb17 100644
> --- a/gcc/config/aarch64/aarch64.cc
> +++ b/gcc/config/aarch64/aarch64.cc
> @@ -1971,7 +1971,7 @@ static const struct tune_params ampere1a_tunings =
>    2, /* min_div_recip_mul_df. */
>    0, /* max_case_values. */
>    tune_params::AUTOPREFETCHER_WEAK, /* autoprefetcher_model. */
> -  (AARCH64_EXTRA_TUNE_NONE), /* tune_flags. */
> +  (AARCH64_EXTRA_TUNE_NO_LDP_COMBINE), /* tune_flags. */
>    &ampere1_prefetch_tune
>  };
>
> @@ -26053,6 +26053,12 @@ aarch64_operands_ok_for_ldpstp (rtx *operands, bool load,
>    enum reg_class rclass_1, rclass_2;
>    rtx mem_1, mem_2, reg_1, reg_2;
>
> +  /* Allow the tuning structure to disable LDP instruction formation
> +     from combining instructions (e.g., in peephole2). */
> +  if (load && (aarch64_tune_params.extra_tuning_flags
> +               & AARCH64_EXTRA_TUNE_NO_LDP_COMBINE))
> +    return false;

If we do decide to do this, I think this is not a complete approach. See the similar tuning flag AARCH64_EXTRA_TUNE_NO_LDP_STP_QREGS.
There are various other places in the backend that would need to be adjusted to avoid bringing loads together for the peephole2s to merge (the sched_fusion stuff).
Plus there are the cpymem expansions that would generate load pairs too...
We'd want some testcases added to check that LDPs are blocked too...

Thanks,
Kyrill

> +
>    if (load)
>      {
>        mem_1 = operands[1];
> --
> 2.34.1
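
For reference, the check under discussion is a plain bitmask test on the tuning flags. A minimal standalone C sketch follows. The enum and `pair_allowed_p` are hypothetical stand-ins for illustration, not the GCC definitions (the real flags are generated from aarch64-tuning-flags.def), and the sketch models only the early return added by the hunk:

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the AARCH64_EXTRA_TUNE_* bit flags;
   the values here are illustrative, not the ones GCC generates.  */
enum extra_tune_flags
{
  EXTRA_TUNE_NONE = 0,
  EXTRA_TUNE_NO_LDP_STP_QREGS = 1 << 0,
  EXTRA_TUNE_NO_LDP_COMBINE = 1 << 1
};

/* Hypothetical helper mirroring the guard in the patch: refuse to form
   a pair only when the access is a load and the tuning flag is set.
   Store pairs are never rejected by this check.  */
static bool
pair_allowed_p (unsigned int tune_flags, bool load)
{
  if (load && (tune_flags & EXTRA_TUNE_NO_LDP_COMBINE))
    return false;
  return true;
}
```

With `EXTRA_TUNE_NO_LDP_COMBINE` set, `pair_allowed_p (flags, true)` returns false while `pair_allowed_p (flags, false)` still returns true, matching the `load &&` condition in the hunk. It covers only this one entry point; the sched_fusion and cpymem paths mentioned in the reply would need their own checks.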