From: Xinliang David Li
To: Ramana Radhakrishnan
Cc: Sriraman Tallam, Ramana Radhakrishnan, Jan Hubicka, "H.J. Lu", Pedro Alves, Michael Matz, GCC Patches
Date: Tue, 02 Jun 2015 21:25:00 -0000
Subject: Re: [RFC][PATCH][X86_64] Eliminate PLT stubs for specified external functions via -fno-plt=
References: <20150529193552.GA52215@kam.mff.cuni.cz> <556C16B1.5080606@arm.com>
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
X-SW-Source: 2015-06/txt/msg00255.txt.bz2

On Tue, Jun 2, 2015 at 1:56 PM, Ramana Radhakrishnan wrote:
> On Tue, Jun 2, 2015 at 7:15 PM, Sriraman Tallam wrote:
>> On Mon, Jun 1, 2015 at 1:33 PM, Ramana Radhakrishnan
>> wrote:
>>> On Mon, Jun 1, 2015 at 7:55 PM, Sriraman Tallam wrote:
>>>> On Mon, Jun 1, 2015 at 11:41 AM, Ramana Radhakrishnan
>>>> wrote:
>>>>> On Mon, Jun 1, 2015 at 7:01 PM, Sriraman Tallam wrote:
>>>>>> On Mon, Jun 1, 2015 at 1:24 AM, Ramana Radhakrishnan
>>>>>> wrote:
>>>>>>>
>>>>>>>>> Why isn't it just an indirect call in the cases that would require a GOT
>>>>>>>>> slot and a direct call otherwise ? I'm trying to work out what's so
>>>>>>>>> different on each target that mandates this to be in the target backend.
>>>>>>>>> Also it would be better to push the tests into gcc.dg if you can and
>>>>>>>>> check
>>>>>>>>> for the absence of a relocation so that folks at least see these as being
>>>>>>>>> UNSUPPORTED on their target.
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> To be even more explicit, shouldn't this be handled similar to the way in
>>>>>>> which -fno-plt is handled in a target agnostic manner ? After all, if you
>>>>>>> can handle this for the command line, doing the same for a function which
>>>>>>> has been decorated with attribute((noplt)) should be simple.
>>>>>>
>>>>>> -fno-plt does not work for non-PIC code, having non-PIC code not use
>>>>>> PLT was my primary motivation. In fact, if you go back in this thread,
>>>>>> I suggested to HJ if I should piggyback on -fno-plt. I tried using
>>>>>> the -fno-plt implementation to do this by removing the flag_pic check
>>>>>> in calls.c, but that does not still work for non-PIC code.
>>>
>>> If you want __attribute__ ((noplt)) to work for non-PIC code, we
>>> should look to code it in the same place surely by making all
>>> __attribute__((noplt)) calls, indirect calls irrespective of whether
>>> it's fpic or not.
>>>
>>>
>>>>>
>>>>> You're missing my point, unless I'm missing something basic here - I
>>>>> should have been even more explicit and said -fPIC was a given in all
>>>>> this discussion.
>>>>>
>>>>> calls.c:229 has
>>>>>
>>>>> else if (flag_pic && !flag_plt && fndecl_or_type
>>>>>          && TREE_CODE (fndecl_or_type) == FUNCTION_DECL
>>>>>          && !targetm.binds_local_p (fndecl_or_type))
>>>>>
>>>>> why can't we merge the check in here for the attribute noplt ?
>>>>
>>>> We can and please see this thread, that is the exact patch I proposed :
>>>> https://gcc.gnu.org/ml/gcc-patches/2015-05/msg02682.html
>>>>
>>>> However, there was one caveat. I want this working without -fPIC too.
>>>> non-PIC code also generates PLT calls and I want them eliminated.
>>>>
>>>>>
>>>>> If a new attribute is added to the "GNU language" in this case, why
>>>>> isn't this being treated in the same way as the command line option
>>>>> has been treated ? All this means is that we add an attribute and a
>>>>> command line option to common code and then not implement it in a
>>>>> proper target agnostic fashion.
>>>>
>>>> You are right. This is the way I wanted it too but I also wanted the
>>>> attribute to work without PIC. PLT calls are generated without -fPIC
>>>> and -fPIE too and I wanted a solution for that. On looking at the
>>>> code in more detail,
>>>>
>>>> * -fno-plt is made to work with -fPIC, is there a reason to not make
>>>> it work for non-PIC code? I can remove the flag_pic check from
>>>> calls.c
>>>
>>> I don't think that's right, you probably have to allow that along with
>>> (flag_pic || (decl && attribute_no_plt (decl)) - however it seems odd
>>> to me that the language extension allows this but the flag doesn't.
>>>
>>>> * Then, I add the generic attribute "noplt" and everything is fine.
>>>>
>>>> There is just one caveat with the above approach, for x86_64
>>>> (*call_insn) will not generate indirect-calls for *non-PIC* code
>>>> because constant_call_address_operand in predicates.md will evaluate
>>>> to false. This can be fixed appropriately in ix86_output_call_insn in
>>>> i386.c.
>>>
>>> Yes, targets need to massage that into place but that's essentially
>>> the mechanics of retaining indirect calls in each backend. -fno-plt
>>> doesn't work for ARM / AArch64 with optimizers currently (and I
>>> suspect on most other targets) because our predicates are too liberal,
>>> fixed by treating "noplt" or -fno-plt as the equivalent of
>>> -mlong-calls.
>>>
>>>>
>>>>
>>>> Is this alright? Sorry for the confusion, but the primary reason why
>>>> I did not do it the way you suggested is because we wanted "noplt"
>>>> attribute to work for non-PIC code also.
>>>
>>> If that is the case, then this is a slightly more complicated
>>> condition in the same place. We then always have indirect calls for
>>> functions that are marked noplt and just have target generate this
>>> appropriately.
>>
>> I have now modified this patch.
>
> Thanks for taking care of this. I'll have a read through tomorrow
> morning when I'm at my normal work machine.
>
>>
>> This patch does two things:
>>
>> 1) Adds new generic function attribute "no_plt" that is similar in
>> functionality to -fno-plt except that it applies only to calls to
>> functions that are marked with this attribute.
>> 2) For x86_64, it makes -fno-plt(and the attribute) also work for
>> non-PIC code by directly generating an indirect call via a GOT entry.
>
> I'm sorry I'm going to push back again for the same reason.
>
> Other than forcing targets to tweak their call insn patterns, the act
> of generating the indirect call should remain in target independent
> code.
> Sorry, not having the same behaviour on all platforms for
> something like this is just a recipe for confusion.

Do you have a good suggestion for how to implement this (non-PIC
no-plt) in a clean and target-independent way?

Regarding the 'confusion' part, is it a matter of documentation? That
can be updated when more targets start to support it more efficiently.

David

>
> regards
> Ramana
>
>>
>> For PIC code, no_plt merely shadows the implementation of -fno-plt, no
>> surprises here.
>>
>> * c-family/c-common.c (no_plt): New attribute.
>> (handle_no_plt_attribute): New handler.
>> * calls.c (prepare_call_address): Check for no_plt
>> attribute.
>> * config/i386/i386.c (ix86_function_ok_for_sibcall): Check
>> for no_plt attribute.
>> (ix86_expand_call): Ditto.
>> (nopic_no_plt_attribute): New function.
>> (ix86_output_call_insn): Output indirect call for non-pic
>> no plt calls.
>> * doc/extend.texi (no_plt): Document new attribute.
>> * testsuite/gcc.target/i386/noplt-1.c: New test.
>> * testsuite/gcc.target/i386/noplt-2.c: New test.
>> * testsuite/gcc.target/i386/noplt-3.c: New test.
>> * testsuite/gcc.target/i386/noplt-4.c: New test.
>>
>>
>> Please review.
>>
>> Thanks
>> Sri
>>
>>
>>>
>>> To be honest, this is trivial to implement in the ARM backend as one
>>> would just piggy back on the longcalls work - despite that, IMNSHO
>>> it's best done in a target independent manner.
>>>
>>> regards
>>> Ramana
>>>
>>>>
>>>> Thanks
>>>> Sri
>>>>
>>>>>
>>>>> regards
>>>>> Ramana
>>>>>
>>>>>
>>>>>>
>>>>>>>
>>>>>>>> I am not familiar with PLT calls for other targets. I can move the
>>>>>>>> tests to gcc.dg but what relocation are you suggesting I check for?
>>>>>>>
>>>>>>>
>>>>>>> Move the test to gcc.dg, add a target_support_no_plt function in
>>>>>>> testsuite/lib/target-supports.exp and mark this as being supported only on
>>>>>>> x86 and use scan-assembler to scan for PLT relocations for x86. Other
>>>>>>> targets can add things as they deem fit.
>>>>>>
>>>>>>>
>>>>>>> In any case, on a large number of elf/ linux targets I would have thought
>>>>>>> the absence of a JMP_SLOT relocation would be good enough to check that this
>>>>>>> is working correctly.
>>>>>>>
>>>>>>> regards
>>>>>>> Ramana
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> Thanks
>>>>>>>> Sri
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Ramana
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Also I think the PLT calls have EBX in call fusage which is added by
>>>>>>>>>>> ix86_expand_call.
>>>>>>>>>>> else
>>>>>>>>>>>   {
>>>>>>>>>>>     /* Static functions and indirect calls don't need the pic
>>>>>>>>>>>        register.  */
>>>>>>>>>>>     if (flag_pic
>>>>>>>>>>>         && (!TARGET_64BIT
>>>>>>>>>>>             || (ix86_cmodel == CM_LARGE_PIC
>>>>>>>>>>>                 && DEFAULT_ABI != MS_ABI))
>>>>>>>>>>>         && GET_CODE (XEXP (fnaddr, 0)) == SYMBOL_REF
>>>>>>>>>>>         && ! SYMBOL_REF_LOCAL_P (XEXP (fnaddr, 0)))
>>>>>>>>>>>       {
>>>>>>>>>>>         use_reg (&use, gen_rtx_REG (Pmode,
>>>>>>>>>>>                                     REAL_PIC_OFFSET_TABLE_REGNUM));
>>>>>>>>>>>         if (ix86_use_pseudo_pic_reg ())
>>>>>>>>>>>           emit_move_insn (gen_rtx_REG (Pmode,
>>>>>>>>>>>                                        REAL_PIC_OFFSET_TABLE_REGNUM),
>>>>>>>>>>>                           pic_offset_table_rtx);
>>>>>>>>>>>       }
>>>>>>>>>>>
>>>>>>>>>>> I think you want to take that away from FUSAGE there just like we do
>>>>>>>>>>> for
>>>>>>>>>>> local calls
>>>>>>>>>>> (and in fact the code should already check flag_pic && flag_plt I
>>>>>>>>>>> suppose.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Done that now and patch attached.
>>>>>>>>>>
>>>>>>>>>> Thanks
>>>>>>>>>> Sri
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Honza
>>>>>>>>
>>>>>>>>
>>>>>>>
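
For readers following the thread, here is a minimal C sketch of what
"an indirect call instead of a PLT stub" means at the source level. It
is not taken from the patch under review. The names len_direct and
len_indirect are made up for illustration, and the instructions a
compiler actually emits depend on the compiler version, flags and
target.

```c
#include <stddef.h>
#include <string.h>

/* Ordinary direct call to an external function.  When the callee ends
   up in a shared library, such a call is typically routed through a
   PLT stub created by the compiler and/or linker.  */
size_t
len_direct (const char *s)
{
  return strlen (s);
}

/* Hand-written approximation of a no-PLT call: take the function's
   address and call through the pointer.  The volatile qualifier is
   only there to discourage the optimizer from turning this back into
   a direct call; it is an illustration device, not something the
   thread proposes.  */
size_t
len_indirect (const char *s)
{
  size_t (*volatile fn) (const char *) = strlen;
  return fn (s);
}
```

Both functions return the same value for the same input. The only
intended difference is how the call reaches strlen, which can be
inspected by comparing the assembly and the relocations of the two
functions (for example, whether a JMP_SLOT relocation is present).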