From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 93499 invoked by alias); 29 Jan 2016 15:18:37 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 93458 invoked by uid 89); 29 Jan 2016 15:18:36 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-2.2 required=5.0 tests=AWL,BAYES_00,FREEMAIL_FROM,RCVD_IN_DNSWL_LOW,SPF_PASS autolearn=ham version=3.3.2 spammy=H*M:eurprd07, H*c:Windows-1252, HX-OriginatorOrg:sk:sct-15-, Hx-exchange-antispam-report-cfa-test:82015046 X-HELO: DUB004-OMC1S35.hotmail.com Received: from dub004-omc1s35.hotmail.com (HELO DUB004-OMC1S35.hotmail.com) (157.55.0.234) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES256-SHA256 encrypted) ESMTPS; Fri, 29 Jan 2016 15:18:35 +0000 Received: from emea01-am1-obe.outbound.protection.outlook.com ([157.55.0.237]) by DUB004-OMC1S35.hotmail.com over TLS secured channel with Microsoft SMTPSVC(7.5.7601.23008); Fri, 29 Jan 2016 07:18:32 -0800 Received: from HE1PR07MB0905.eurprd07.prod.outlook.com (10.162.26.12) by HE1PR07MB0907.eurprd07.prod.outlook.com (10.162.26.14) with Microsoft SMTP Server (TLS) id 15.1.365.19; Fri, 29 Jan 2016 15:18:31 +0000 Received: from HE1PR07MB0905.eurprd07.prod.outlook.com ([10.162.26.12]) by HE1PR07MB0905.eurprd07.prod.outlook.com ([10.162.26.12]) with mapi id 15.01.0365.024; Fri, 29 Jan 2016 15:18:30 +0000 From: Bernd Edlinger To: Bernd Schmidt , Eric Botcazou , Richard Sandiford CC: "gcc-patches@gcc.gnu.org" , Matthew Fortune , Nick Clifton Subject: Re: Is it OK for rtx_addr_can_trap_p_1 to attempt to compute the frame layout? (was Re: [PATCH] Skip re-computing the mips frame info after reload completed) Date: Fri, 29 Jan 2016 15:18:00 -0000 Message-ID: References: <6D39441BF12EF246A7ABCE6654B023536A705EB0@LEMAIL01.le.imgtec.org> <87y4bcavl5.fsf_-_@googlemail.com> <1634352.Ak3Qm2WKut@polaris> <56AABBC5.6090309@redhat.com> In-Reply-To: <56AABBC5.6090309@redhat.com> authentication-results: redhat.com; dkim=none (message not signed) header.d=none;redhat.com; dmarc=none action=none header.from=hotmail.de; x-ms-exchange-messagesentrepresentingtype: 1 x-microsoft-exchange-diagnostics: 1;HE1PR07MB0907;5:0FYLQmcDO6oPwqqYUUdk9gNRKWwGmO7Vua0igYnt5UPcjyRSx3Rbb/gI6gcImVwa4wB6YFg6lfFksuDid3uYyhR5QSTYXpB7OLUMnS759MF44lz/Y7kB2OF1Djicfxzh8JlqFhQcfmw2ypHp2VionQ==;24:z+onkT99Fwt2K0jyYw/azpsxiqMWTtfHXNT0zzzc53gGFJWP84HvZRpDgRvw3JqQimQUH2j0us7C0/O6cB3hXRzPUvMrkdlAVnjJzLBf5a4= x-microsoft-antispam: UriScan:;BCL:0;PCL:0;RULEID:;SRVR:HE1PR07MB0907; x-ms-office365-filtering-correlation-id: 2fce66c5-6429-4470-9f15-08d328bf6d77 x-exchange-antispam-report-cfa-test: BCL:0;PCL:0;RULEID:(432015012)(82015046);SRVR:HE1PR07MB0907;BCL:0;PCL:0;RULEID:;SRVR:HE1PR07MB0907; x-forefront-prvs: 083691450C x-forefront-antispam-report: SFV:NSPM;SFS:(7070004)(6009001)(377454003)(24454002)(479174004)(77096005)(102836003)(586003)(86362001)(4326007)(122556002)(5890100001)(92566002)(93886004)(19580405001)(74316001)(87936001)(10400500002)(4000100100001)(106116001)(5008740100001)(2900100001)(40100003)(76576001)(75402003)(73972006)(2950100001)(50986999)(82202001)(54356999)(33656002)(74482002)(5001770100001)(5002640100001)(189998001)(3470700001)(5003600100002)(3280700002)(1220700001)(3660700001)(76176999);DIR:OUT;SFP:1901;SCL:1;SRVR:HE1PR07MB0907;H:HE1PR07MB0905.eurprd07.prod.outlook.com;FPR:;SPF:None;MLV:sfv;LANG:en; spamdiagnosticoutput: 1:23 spamdiagnosticmetadata: NSPM Content-Type: text/plain; charset="Windows-1252" Content-ID: Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 X-OriginatorOrg: sct-15-1-318-9-msonline-outlook-efc2f.templateTenant X-MS-Exchange-CrossTenant-originalarrivaltime: 29 Jan 2016 15:18:29.9637 (UTC) X-MS-Exchange-CrossTenant-fromentityheader: Hosted X-MS-Exchange-CrossTenant-id: 84df9e7f-e9f6-40af-b435-aaaaaaaaaaaa X-MS-Exchange-Transport-CrossTenantHeadersStamped: HE1PR07MB0907 X-SW-Source: 2016-01/txt/msg02314.txt.bz2 On 29.01.2016 02:09, Bernd Schmidt wrote: > On 01/28/2016 12:36 AM, Eric Botcazou wrote: >>> [cc-ing Eric as RTL maintainer] >> >> Sorry for the delay, the message apparently bounced] >> >>> IMO the problem is that rtx_addr_can_trap_p_1 duplicates a large >>> bit of LRA/reload logic: >>> >>> [...] >>> >>> Under the current interface macros like INITIAL_ELIMINATION_OFFSET >>> are expected to trigger the calculation of the target's frame layout >>> (since you need that information to answer the question). >>> To me it seems wrong that we're attempting to call that sort of >>> macro in a query routine like rtx_addr_can_trap_p_1. >> >> I'm a little uncomfortable stepping in here because, while I >> essentially share >> your objections and was opposed to the patch (I was almost sure that >> it would >> introduce maintainance issues for no practical benefit), I didn't >> imagine that >> such a failure mode would have been possible (computing an >> approximation of >> the frame layout after reload being problematic) so I didn't really >> object to >> being overruled after seeing Bernd's patch... >> >>> IMO we should cache the information we need@the start of each >>> LRA/reload cycle. This will be more robust, because there will >>> be no accidental changes to global state either during or after >>> LRA/reload. It will also be more efficient because >>> rtx_addr_can_trap_p_1 can read the cached variables rather >>> than calling back into the target. >> >> That would be a better design for sure and would eliminate the >> maintainance >> issues I was originally afraid of. >> >> My recommendation would be to back out Bernd's patch for GCC 6.0 >> (again, it >> doesn't fix any regression and, more importantly, any bug in real >> software, >> but only corner case bugs in dumb computer-generated testcases) but >> with the >> commitment to address the underlying issue for GCC 7.0 and backport >> the fix to >> GCC 6.x unless really impracticable. That being said, it's ultimately >> Jakub >> and Richard's call. > > I'm on the fence; I do think the original problem is an issue we should > fix, but I'm also not terribly happy with the implementation we have > right now. Besides the issues already mentioned, doesn't it kind of > assume these offsets are constant (which they definitely aren't, > consider arg pushes for example)? > Yes that is right. I saw it as a thing that could possibly happen more often than we know, because it is difficult to spot a wrong code by ordinary tests. Even the reproducer from pr61047 does not crash when it runs in the gcc testsuite, I had to tweak the example first to use a larger offset to compensate the large number of environment values that are passed from the test suite in each test execution. Yes, rtx_addr_can_trap_p_1 does not know how many bytes are pushed on the stack and I saw no easy way how to get to the REG_NOTES for instance, because they are attached to the INSN and rtx_addr_can_trap_p has only access to a MEM rtx. It is no exact science, but the error is on the safe side. Nevertheless, with the data from the target hook the approximation is good enough, to not pessimize any single bit of code that is generated in stage2 vs. stage3, which would not have been the case without some help from the target. > I think a better approach might be to just mark accesses at known > locations in the frame, or arg pushes, as MEM_NOTRAP_P, and consider > accesses with non-constant or calculated offsets as potentially trapping. > Yes I think also that might be a next step. It would probably be good to somehow fixate the result of rtx_addr_can_trap_p immediately before reload when the RTX's are still FRAMEP+x and ARGP+x, and annotate that somehow to the reloaded RTX's, that way it would finally be superfluous to call the target hook at all, because the actual addresses should not change during or after reload. It will probably have to be a spare bit on the RTX that is currently unused, because MEM_NOTRAP_P is already used for something different. It will however not be simple to find a valid piece of C code where the current implementation with all of its limitations generates different code compared to an implementation that has access to the exact offsets. As I said, I tried already, but could not find an example of a missed optimization due to my patch. Thanks Bernd. > > Bernd