From: Richard Sandiford
To: "Maciej W. Rozycki"
Cc: Ilie Garbacea, Joseph Myers, binutils@sourceware.org, Chao-ying Fu,
    Rich Fuhler, David Lau, Kevin Mills, Catherine Moore, Nathan Sidwell,
    Nathan Froyd
Subject: Re: [PATCH] MIPS: microMIPS ASE support
Date: Sat, 18 Dec 2010 10:26:00 -0000
Message-ID: <87wrn79yum.fsf@firetop.home>
In-Reply-To: (Maciej W. Rozycki's message of "Thu, 16 Dec 2010 11:16:23 +0000 (GMT)")
References: <87y6fa9u3t.fsf@firetop.home> <876302kqvu.fsf@firetop.home> <871v5n9m7e.fsf@firetop.home>

"Maciej W. Rozycki" writes:
>> >  Only link variations of branches and jumps have a fixed-size delay slot
>> > -- that's because the link register is set to a fixed offset from the
>> > delay-slot instruction (either four as with JAL or two as with JALS).  Of
>> > all such jumps and branches only JALX does not have a JALXS counterpart
>> > (regrettably, as it would have made life of software much, much easier).
>> >
>> >  I've explained the meaning of 0 below -- it's unsafe to return this value
>> > for a variable-size delay slot.
>>
>> Hmm, I was thinking of the case where there was no branch _after_
>> the LUI, and where the instruction after the LUI could then become
>> the delay slot for a variable-length branch before the (deleted) LUI.
>> But yeah, I can see that 0 isn't correct if there is a branch immediately
>> after the LUI.
>
>  Well, if we have code like this:
>
>       branch  ...
>       LUI     ...
>       insn    [...]
>
> (where for the purpose of this consideration BRANCH may also be a jump)
> then LUI cannot be entirely deleted and INSN moved into the slot of BRANCH
> no matter if INSN is a branch or an computational instruction.  All we can
> do in this case is to see if there is a corresponding BRANCHC instruction
> and use it to swap BRANCH with and then delete the LUI if so, or otherwise
> shrink the LUI to a 16-bit NOP if BRANCH permits or can be swapped with
> BRANCHS to permit a 16-bit delay-slot instruction.  If neither is
> possible, then the LUI is merely substituted with a 32-bit NOP (although
> the effect is purely cosmetical in this case; perhaps we should just back
> out).

Yeah, I see your point.  I was thinking that the code claims to "know"
that the LUI and "insn" are both part of the same load address.  So if
the branch was taken, the target of the LUI ought to be dead.

However, I agree that (even though the code does seem to assume that
to some extent) the assumption is wrong.  E.g. you could have:

        beqz    $2,1f
        lui     $4,%hi(foo)     <-- A

        addiu   $4,$4,%lo(foo)  <-- B
        ...
        jr      $31
2:      ...
        lui     $4,%hi(foo)     <-- C
        ...
1:      addiu   $4,$4,%lo(foo)  <-- D

In this case, the LO16 reloc for D might follow the HI16 reloc for C,
and the LO16 reloc for B might follow the HI16 reloc for A.  AIUI, we'd
consider relaxing A/B but not C/D.  In this case, turning A into a NOP
is wrong, because $4 is still live at D.  If you agree then...

>  Also with the recent update to LUI relaxation code I think we should
> simply disallow the optimisation if a LUI is in a delay slot of an
> unconditional branch -- we have no way to verify the corresponding LO16
> reloc really belongs to this LUI instruction in that case.  This will let
> us simplify code (which has become a little bit hairy by now IMO) a little
> bit I would guess.

...maybe it would be simpler to drop the optimisation if the LUI is in
any kind of delay slot.  I think this would simplify the code, and I
don't think we'd then need to check for branch relocs.  We'd just have
*_norel-like functions (although not called that any more) to check
for every kind of branch.

I obviously had a bit of a mental block when reviewing this delay slot
stuff, sorry.

Richard
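
For illustration only, the simpler rule suggested above (skip the LUI
relaxation whenever the LUI sits in any kind of delay slot) could be
sketched roughly as below.  This is a toy model, not the binutils code:
the enum, the function names and the flat instruction array are all
hypothetical, and a real implementation would have to decode the
section contents instead.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified instruction classes.  These are assumptions
   made for this sketch and do not mirror any binutils data structure.  */
enum insn_kind
{
  INSN_OTHER,           /* any instruction without a delay slot */
  INSN_LUI,             /* lui rt,%hi(sym) */
  INSN_BRANCH,          /* conditional branch with a delay slot */
  INSN_JUMP,            /* unconditional branch or jump with a delay slot */
  INSN_COMPACT_BRANCH   /* compact branch (BRANCHC): no delay slot */
};

/* Return true if an instruction of kind KIND is followed by a delay
   slot.  */
static bool
has_delay_slot (enum insn_kind kind)
{
  return kind == INSN_BRANCH || kind == INSN_JUMP;
}

/* Return true if the instruction at index IDX is a LUI that may be
   considered for relaxation under the simplified rule: it must not be
   in the delay slot of the preceding instruction, whatever kind of
   branch or jump that is.  */
static bool
lui_relax_candidate_p (const enum insn_kind *insns, size_t count,
                       size_t idx)
{
  if (idx >= count || insns[idx] != INSN_LUI)
    return false;

  /* A LUI directly after a branch or jump with a delay slot is left
     alone, so no HI16/LO16 pairing has to be verified for it.  */
  if (idx > 0 && has_delay_slot (insns[idx - 1]))
    return false;

  return true;
}
```

With a sequence such as { INSN_BRANCH, INSN_LUI, INSN_OTHER, INSN_LUI },
the first LUI (like A in the example above) is rejected because it is
in the branch's delay slot, while the second one remains a candidate.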