From: Wilco Dijkstra
To: GCC Patches, James Greenhalgh, Jeff Law
CC: nd
Subject: Re: [PATCH][AArch64] Simplify frame layout for stack probing
Date: Tue, 01 Aug 2017 10:18:00 -0000

ping

From: Wilco Dijkstra
Sent: 25 July 2017 14:58
To: GCC Patches; James Greenhalgh; Jeff Law
Cc: nd
Subject: [PATCH][AArch64] Simplify frame layout for stack probing

This patch makes some changes to the frame layout in order to simplify
stack probing.  We want to use the save of LR as a probe in any non-leaf
function.  With shrinkwrapping we may only save LR before a call, so it
is useful to define a fixed location in the callee-saves.  So force LR at
the bottom of the callee-saves even with -fomit-frame-pointer.

Also remove a rarely used frame layout that saves the callee-saves first
with -fomit-frame-pointer.

OK for commit (and backport to GCC7)?

ChangeLog:
2017-07-25  Wilco Dijkstra

        * config/aarch64/aarch64.c (aarch64_layout_frame):
        Ensure LR is always stored at the bottom of the callee-saves.
        Remove frame option which saves callee-saves at top of frame.

--
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index b8a4160d9de8e689ccd26cb9f0ce046ee65e0ef4..3fc36ae28d18b9635480fd99f1fa7719267e66e4 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -2875,7 +2875,8 @@ aarch64_frame_pointer_required (void)
 
 /* Mark the registers that need to be saved by the callee and calculate
    the size of the callee-saved registers area and frame record (both FP
-   and LR may be omitted).  */
+   and LR may be omitted).  If the function is not a leaf, ensure LR is
+   saved at the bottom of the callee-save area.  */
 static void
 aarch64_layout_frame (void)
 {
@@ -2926,7 +2927,14 @@ aarch64_layout_frame (void)
       cfun->machine->frame.wb_candidate1 = R29_REGNUM;
       cfun->machine->frame.reg_offset[R30_REGNUM] = UNITS_PER_WORD;
       cfun->machine->frame.wb_candidate2 = R30_REGNUM;
-      offset += 2 * UNITS_PER_WORD;
+      offset = 2 * UNITS_PER_WORD;
+    }
+  else if (!crtl->is_leaf)
+    {
+      /* Ensure LR is saved at the bottom of the callee-saves.  */
+      cfun->machine->frame.reg_offset[R30_REGNUM] = 0;
+      cfun->machine->frame.wb_candidate1 = R30_REGNUM;
+      offset = UNITS_PER_WORD;
     }
 
   /* Now assign stack slots for them.  */
@@ -3025,20 +3033,6 @@ aarch64_layout_frame (void)
       cfun->machine->frame.final_adjust
         = cfun->machine->frame.frame_size - cfun->machine->frame.callee_adjust;
     }
-  else if (!frame_pointer_needed
-          && varargs_and_saved_regs_size < max_push_offset)
-    {
-      /* Frame with large local area and outgoing arguments (this pushes the
-        callee-saves first, followed by the locals and outgoing area):
-        stp reg1, reg2, [sp, -varargs_and_saved_regs_size]!
-        stp reg3, reg4, [sp, 16]
-        sub sp, sp, frame_size - varargs_and_saved_regs_size  */
-      cfun->machine->frame.callee_adjust = varargs_and_saved_regs_size;
-      cfun->machine->frame.final_adjust
-       = cfun->machine->frame.frame_size - cfun->machine->frame.callee_adjust;
-      cfun->machine->frame.hard_fp_offset = cfun->machine->frame.callee_adjust;
-      cfun->machine->frame.locals_offset = cfun->machine->frame.hard_fp_offset;
-    }
   else
     {
       /* Frame with large local area and outgoing arguments using frame pointer: