From: Rich Felker
To: Oleg Endo
Cc: gcc-patches@gcc.gnu.org
Date: Wed, 11 Nov 2015 14:56:00 -0000
Subject: Re: [PATCH v4] SH FDPIC backend support
Message-ID: <20151111145642.GV3818@brightrain.aerifal.cx>
In-Reply-To: <1447252586.3080.9.camel@t-online.de>

On Wed, Nov 11, 2015 at 11:36:26PM +0900, Oleg Endo wrote:
> On Tue, 2015-11-10 at 15:07 -0500, Rich Felker wrote:
> > > The way libcalls are now emitted is a bit unhandy.
> > > If more special-ABI libcalls are to be added in the future, they all
> > > have to do the jsr vs. bsrf handling (some potential candidates for
> > > new libcalls are optimized soft FP routines).  Then we still have
> > > PR 65374 and PR 54019.  In the future maybe we should come up with
> > > something that allows emitting libcalls in a more transparent way...
> >
> > I'd like to look into improving this at some point in the near
> > future.  On further reading of the changes made, I think there's a
> > lot of code we could reduce or simplify.
> >
> > In all the places where new RTL patterns were added for *call*_fdpic,
> > the main constraint change vs the non-fdpic version is using REG_PIC.
> > Is it possible to make a REG_GOT_ARG macro or similar that's defined
> > as something like TARGET_FDPIC ? REG_PIC : nonexistent_or_dummy?
>
> I'm not sure I understand what you mean by that.  Do you have a small
> code snippet example?

Sorry, I don't really understand RTL well enough to make a code
snippet. What I want to express is that an insn "uses" (in the
(use ...) sense) a register (r12) conditionally depending on a runtime
option (TARGET_FDPIC).

> > As for the call site stuff, I wonder why the existing call site stuff
> > used by "call_pcrel" can't be used for SFUNC_STATIC.
>
> "call_pcrel" is a real call insn.  The libcalls are not expanded as
> real call insns to avoid the regular register save/restores etc which
> is needed to do a normal function call.

Yes, I see that. What I was really wondering though is why the new
call site generation code and constraint was added when the call_pcrel
code already has mechanisms for this, rather than just duplicating the
internals that call_pcrel uses. It seems like we're doing things in a
gratuitously different way here.

> I guess the generic fix for this issue would be some mechanism to
> specify which regs are clobbered/preserved and then provide the right
> settings for the libcall functions.

Is this possible in the sh backend or does it need changes to
higher-level gcc code? (i.e. is it presently possible to make an insn
that conditionally clobbers different things rather than having to
make tons of different insns for each possible set of clobbers?)

> > I'm actually trying to prepare a simpler FDPIC patch for other gcc
> > versions we're interested in that's not so invasive, and for now I'm
> > just having function_symbol replace SFUNC_STATIC with SFUNC_GOT on
> > TARGET_FDPIC to avoid needing all the label stuff, but it would be
> > nice to find a way to reuse the existing framework.
>
> Do you know how this affects code size (and inherently performance)?

I suspect it makes very little difference, but to compare I'd need to
do the same hack on 5.2.0 or trunk. The only difference should be one
additional load per call, and one additional GOT slot per function
called this way (but just once per executable/library).

Another issue I've started looking at is how r12 is put in fixed_regs,
which is conceptually wrong. Preliminary tests show that removing it
from fixed_regs doesn't break and produces much better code -- r12
gets used as a temp register in functions that don't need it, and in
one function that made multiple calls, the saving of initial r12 to a
call-saved register even happened in the delay slot of the call.

I've been discussing it with Alexander Monakov on IRC (#musl) and
based on my understanding so far of how gcc works (which admittedly
may be wrong) the current FDPIC code looks like it's written not to
depend on r12 being 'fixed'.

Also I think I'm pretty close to understanding how we could make the
same improvements for non-FDPIC PIC codegen: instead of loading r12 in
the prologue, load a pseudo, then use that pseudo for GOT access and
force it into r12 the same way FDPIC call code does for PLT calls.
Does this sound correct?

Rich
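
A rough illustration of the REG_GOT_ARG idea from the message, written
in plain C rather than RTL. This is only a toy model and not sh backend
code: REG_NONE, target_fdpic and got_arg_reg() are made-up names for
illustration, and the value 12 for REG_PIC is assumed from the r12
mentioned in the text.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of a conditionally "used" register; not real GCC code.  */
enum {
  REG_PIC = 12,   /* r12, the GOT register discussed in the thread */
  REG_NONE = -1   /* hypothetical marker: no extra register is used */
};

/* Hypothetical stand-in for the runtime option behind TARGET_FDPIC.  */
static bool target_fdpic = false;
#define TARGET_FDPIC (target_fdpic)

/* The macro from the mail: r12 when FDPIC is on, a dummy otherwise.  */
#define REG_GOT_ARG (TARGET_FDPIC ? REG_PIC : REG_NONE)

/* Hypothetical helper: the register a single shared call pattern
   would list in its (use ...), or REG_NONE when there is none.  */
static int got_arg_reg(void)
{
  int reg = REG_GOT_ARG;
  /* Sanity check: either the dummy or a general register r0..r15.  */
  assert(reg == REG_NONE || (reg >= 0 && reg < 16));
  return reg;
}
```

With target_fdpic set, got_arg_reg() yields 12; otherwise it yields the
dummy value, so one definition covers both cases. Whether a machine
description pattern can express this directly is the open question
raised in the message.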