From mboxrd@z Thu Jan 1 00:00:00 1970
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Subject: Re: PING: [PATCH] PR target/67215: -fno-plt needs improvements for x86
To: "H.J. Lu"
References: <562F5E11.1090503@redhat.com> <562F739F.2090000@foss.arm.com> <562F818A.90003@foss.arm.com> <562F8B6F.7060605@foss.arm.com>
Cc: Ramana Radhakrishnan, Bernd Schmidt, GCC Patches, Jeff Law
From: Jiong Wang
Message-ID: <562F97B3.7060408@foss.arm.com>
Date: Tue, 27 Oct 2015 15:27:00 -0000

On 27/10/15 14:50, H.J. Lu wrote:
> On Tue, Oct 27, 2015 at 7:34 AM, Ramana Radhakrishnan wrote:
>>> OK, then it's fairly x86-64 specific optimization, because we can't do "call *mem" in
>>> aarch64 and some other targets.
>> It is a fairly x86_64 specific optimization and doesn't apply to AArch64.
>>
>> The question really is what impact does removing the generic code handling have on aarch64 - is it a no-op or not for the existing -fno-plt implementation in the AArch64 backend ? The only case that is of interest is the bit below in calls.c and it looks like that may well be redundant with the logic in the backend already, but I have not done the full analysis to convince myself that the code in the backend is sufficient.
>>
>> - && (!flag_plt
>> - || lookup_attribute ("noplt", DECL_ATTRIBUTES (fndecl_or_type)))
>> - && !targetm.binds_local_p (fndecl_or_type))
>>
> -fno-plt is a backend specific optimization and should be handled
> in backend.
>

Removing that generic code has broken aarch64.

Actually, that code in calls.c shouldn't prevent the "call *mem" opportunity on x86-64, because the combine pass should combine "load reg, symbol + call reg" back into "call *mem" on x86-64, as there is a related define_insn.

The testcases in PR67215 and those included in your patch, all of which are loops, fail because either RTL PRE or the loop pass hoists the address calculation out of the loop as an invariant, into a basic block different from that of the call_insn. The combine pass only works within a single basic block, so we miss the combine opportunity on x86-64.

I am not sure whether anyone has experimented with extending the combine pass to a larger scope.
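To make the two call shapes concrete, here is a rough source-level analogy in C. It is only a sketch: it does not reproduce GCC's RTL or the PR67215 testcases, and every name in it (callee, slot, call_via_memory, call_via_hoisted_register, self_check) is hypothetical. The volatile function pointer stands in for a GOT entry. The first loop reloads the entry next to each call, which is the adjacent "load + call" shape that combine could fold into "call *mem". The second loads the entry once before the loop, which mirrors what happens after the address calculation has been hoisted into a different basic block.

```c
#include <assert.h>

/* Hypothetical callee; counts how often it is invoked. */
static int calls;
static void callee(void) { calls++; }

/* Stand-in for a GOT entry (an assumption for illustration only).
   volatile forces a real load from memory at each source-level read. */
static void (*volatile slot)(void) = callee;

/* Load and call sit together inside the loop body, i.e. in the same
   basic block: the shape that could become a single "call *mem". */
int call_via_memory(int n)
{
    calls = 0;
    for (int i = 0; i < n; i++)
        slot();
    return calls;
}

/* The load happens once, before the loop, and the loop calls through
   a local: the shape left behind once the load has been hoisted. */
int call_via_hoisted_register(int n)
{
    void (*fn)(void) = slot;
    calls = 0;
    for (int i = 0; i < n; i++)
        fn();
    return calls;
}

/* Both shapes are behaviourally identical; only the code shape differs. */
void self_check(void)
{
    assert(call_via_memory(3) == 3);
    assert(call_via_hoisted_register(3) == 3);
}
```

Both functions make the same number of calls, so the difference is purely one of code shape. In the hoisted form the load and the call are no longer in the same block, which is why a pass limited to one basic block cannot merge them again.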