From mboxrd@z Thu Jan 1 00:00:00 1970
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Date: Tue, 21 Nov 2017 11:59:00 -0000
From: James Greenhalgh
To: Jeff Law
CC: Wilco Dijkstra, gcc-patches, Richard Earnshaw, Marcus Shawcroft, nd
Subject: Re: [RFA][PATCH] Stack clash protection 07/08 -- V4 (aarch64 bits)
Message-ID: <20171121115712.GA2499@arm.com>
References: <3a6b1bdf-df0f-a512-fd2b-116d57702bc7@redhat.com>

I've finally built up enough courage to start getting
my head around this... I see one outstanding issue sitting on this patch
version:

On Sat, Oct 28, 2017 at 05:08:54AM +0100, Jeff Law wrote:
> On 10/13/2017 02:26 PM, Wilco Dijkstra wrote:
> > --param=stack-clash-protection-probe-interval=13
> > --param=stack-clash-protection-guard-size=12
> >
> > So if there is a good reason to continue with 2 separate values, we must
> > force probe interval <= guard size!
> The param code really isn't designed to enforce values that are
> inter-dependent. It has a min, max & default values. No more, no less.
> If you set up something inconsistent with the params, it's simply not
> going to work.
>
> > Also on AArch64 --param=stack-clash-protection-probe-interval=16 causes
> > crashes due to the offsets used in the probes - we don't need large offsets
> > as we want to probe close to the bottom of the stack.
> Not a surprise. While I tried to handle larger intervals, I certainly
> didn't test them. Given the ISA I wouldn't expect an interval > 12 to
> be useful or necessarily even work correctly.

Understood - weird behaviour with weird params doesn't concern me.

> > Functions with a large stack emit like alloca a lot of code, here I used
> > --param=stack-clash-protection-probe-interval=15:
> >
> > int f1(int x)
> > {
> >   char arr[128*1024];
> >   return arr[x];
> > }
> >
> > f1:
> >         mov     x16, 64512
> >         sub     sp, sp, x16
> >         .cfi_def_cfa_offset 64512
> >         mov     x16, -32768
> >         add     sp, sp, x16
> >         .cfi_def_cfa_offset -1024
> >         str     xzr, [sp, 32760]
> >         add     sp, sp, x16
> >         .cfi_def_cfa_offset -66560
> >         str     xzr, [sp, 32760]
> >         sub     sp, sp, #1024
> >         .cfi_def_cfa_offset -65536
> >         str     xzr, [sp, 1016]
> >         ldrb    w0, [sp, w0, sxtw]
> >         .cfi_def_cfa_offset 131072
> >         add     sp, sp, 131072
> >         .cfi_def_cfa_offset 0
> >         ret
> >
> > Note the cfa offsets are wrong.
> Yes. They definitely look wrong. There's a clear logic error in
> setting up the ADJUST_CFA note when the probing interval is larger than
> 2**12. That should be easily fixed.
> Let me poke at it.

This one does concern me, how did you get on? Did it respond well to prodding?

> > There is an odd mix of a big initial adjustment, then some probes+adjustments and
> > then a final adjustment and probe for the remainder. I can't see the point of having
> > both an initial and remainder adjustment. I would expect this:
> >
> >         sub     sp, sp, 65536
> >         str     xzr, [sp, 1024]
> >         sub     sp, sp, 65536
> >         str     xzr, [sp, 1024]
> >         ldrb    w0, [sp, w0, sxtw]
> >         add     sp, sp, 131072
> >         ret
> I'm really not able to justify spending further time optimizing the
> aarch64 implementation. I've done the best I can. You can take the
> work as-is or improve it, but I really can't justify further time
> investment on that architecture.

Makes sense. Understood. And certainly not required to land this patch.

> > int f2(int x)
> > {
> >   char arr[128*1024];
> >   return arr[x];
> > }
> >
> > f2:
> >         mov     x16, 64512
> >         sub     sp, sp, x16
> >         mov     x16, -65536
> >         movk    x16, 0xfffd, lsl 16
> >         add     x16, sp, x16
> > .LPSRL0:
> >         sub     sp, sp, 4096
> >         str     xzr, [sp, 4088]
> >         cmp     sp, x16
> >         b.ne    .LPSRL0
> >         sub     sp, sp, #1024
> >         str     xzr, [sp, 1016]
> >         ldrb    w0, [sp, w0, sxtw]
> >         add     sp, sp, 262144
> >         ret
> >
> > The cfa entries are OK for this case. There is a mix of positive/negative offsets which
> > makes things confusing. Again there are 3 kinds of adjustments when for this size we
> > only need the loop.
> >
> > Reusing the existing gen_probe_stack_range code appears a bad idea since
> > it ignores the probe interval and just defaults to 4KB. I don't see why it should be
> > any more complex than this:
> >
> >         sub     x16, sp, 262144  // only need temporary if > 1MB
> > .LPSRL0:
> >         sub     sp, sp, 65536
> >         str     xzr, [sp, 1024]
> >         cmp     sp, x16
> >         b.ne    .LPSRL0
> >         ldrb    w0, [sp, w0, sxtw]
> >         add     sp, sp, 262144
> >         ret
> >
> > Probe insertion if final adjustment >= 1024 also generates a lot of redundant
> > code - although this is more a theoretical issue given this is so rare.
> Again, if ARM wants this optimized, then ARM's engineers are going to
> have to take the lead here. I've invested all I can reasonably invest
> in terms of trying optimize the probing for this target.

Likewise here - thanks for your work so far, I have no expectation of this
being fully optimized before I OK it to land.

Sorry for the big delay getting round to this patch. I hope to get serious
time to put into it later this week, and it would be helpful to close out
the few remaining issues before I do.

Thanks,
James
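
For reference, a minimal sketch of the arithmetic behind the CFA issue in the
quoted f1 prologue. It assumes the CFA is defined as sp plus an offset that
starts at 0 on function entry, so each downward sp adjustment should grow the
offset by the same amount. The helper names (cfa_after, expected_cfa_offsets)
are hypothetical and are not GCC code:

```c
#include <assert.h>

/* Expected CFA offset after one stack-pointer adjustment.
   Assumption: CFA = sp + offset, so moving sp down by N (sp_delta = -N)
   must increase the offset by N.  */
static long
cfa_after (long cfa_offset, long sp_delta)
{
  return cfa_offset - sp_delta;
}

/* Fill OUT[i] with the expected CFA offset after applying
   SP_DELTAS[0..i] in order, starting from offset 0 at function entry.  */
static void
expected_cfa_offsets (const long *sp_deltas, int n, long *out)
{
  long cfa = 0;

  assert (n >= 0);
  for (int i = 0; i < n; i++)
    {
      cfa = cfa_after (cfa, sp_deltas[i]);
      out[i] = cfa;
    }
}
```

Feeding it the four sp adjustments from f1 (-64512, -32768, -32768, -1024)
gives 64512, 97280, 130048 and 131072, where the quoted output has 64512,
-1024, -66560 and -65536.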