Date: Thu, 12 May 2005 09:02:00 -0000
From: Alan Modra
To: Richard Henderson
Cc: binutils@sources.redhat.com, Steve Munroe, Anton Blanchard, Paul Mackerras
Subject: Re: powerpc new PLT and GOT
Message-ID: <20050512083650.GH29302@bubble.grove.modra.org>
In-Reply-To: <20050512063729.GA5299@redhat.com>

On Wed, May 11, 2005 at 11:37:29PM -0700, Richard Henderson wrote:
> On Thu, May 12, 2005 at 03:38:14PM +0930, Alan Modra wrote:
> > Not if the plt call stubs are modified to copy their got pointer to,
> > say, r12.
> > Or better, if PLTresolve loads its own got pointer, like
> > this:
>
> Or better, ensure that r0 (or r12 if r0 can't be used in the appropriate
> addressing modes) still contains a copy of ctr, which means that it
> contains a copy of PLTresolve.  Then use pc-relative references to your
> got entries instead of got-relative.

Yes, that would be even nicer.  Thanks for checking over the ABI
proposal.

Hmm, your idea about ctr triggered some more ideas.  The one thing that
I'm a little unhappy about with the new plt call scheme is that

 # ith PLT code stub.
        addis 11,30,(plt+(i-1)*4-got)@ha
        lwzu 0,(plt+(i-1)*4-got)@l(11)
        mtctr 0
        bctr

is slower than the old plt call scheme, which allowed ld.so to optimise
.plt to simple branches.  Steve Munroe improved it a little by
suggesting that when plt and got are close enough we could reduce it to

        lwz 0,(plt+(i-1)*4-got)(30)
        mtctr 0
        bctr

but that loses r11 as an index into the plt.  So each plt call stub
needs a different entry into PLTresolve in order to differentiate plt
entries.  Steve suggested that each entry would load r11, using
"li 11,(i-1)*4; b PLTresolve" as is done with the PowerPC64 .glink.

Combining Steve's idea with yours about ctr gets me to

 # ith PLT code stub.
        addis 11,30,(plt+(i-1)*4-got)@ha
        lwz 11,(plt+(i-1)*4-got)@l(11)
        mtctr 11
        bctr

 # or, if plt+(i-1)*4-got is less than 32k
        lwz 11,(plt+(i-1)*4-got)(30)
        mtctr 11
        bctr

 # A table of branches, one for each plt entry.
 # The idea is that the plt call stub loads ctr (and r11) with these
 # addresses, so (r11 - res_0) gives the plt index * 4.
res_0:  b PLTresolve
res_1:  b PLTresolve
.
 # Some number of entries towards the end can be nops
res_n_m3: nop
res_n_m2: nop
res_n_m1:

PLTresolve:
        mflr 0
        bcl 20,31,1f
1:      mflr 12
        addis 11,11,(1b-res_0)@ha
        addi 11,11,(1b-res_0)@l
        sub 11,11,12                    # r11 = index * 4
        addis 12,12,(got-1b)@ha
        addi 12,12,(got-1b)@l           # r12 = _GLOBAL_OFFSET_TABLE_
        mtlr 0
        add 0,11,11
        add 11,0,11                     # r11 = index * 12 = reloc offset.
        lwz 0,4(12)                     # got[1] address of dl_runtime_resolve
        mtctr 0
        lwz 12,8(12)                    # got[2] contains the map address
        bctr

Of course, if we want to make the normal plt call path go fast, then the
thing to do is have gcc generate the plt call stubs so that they can be
scheduled.  So gcc generates

        addis 11,30,foo@gotplt@ha
        lwz 11,foo@gotplt@l(11)
        mtctr 11,foo@gotplt_marker
        bctr foo@gotplt_marker

hopefully with other instructions scheduled in the sequence.  The funny
looking gotplt_marker relocs are because ld might resolve "foo" to a
local function, and would then turn the sequence into

        nop
        nop
        nop
        bl foo

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre