From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 24468 invoked by alias); 19 Apr 2011 12:56:20 -0000 Received: (qmail 24456 invoked by uid 22791); 19 Apr 2011 12:56:19 -0000 X-SWARE-Spam-Status: No, hits=-2.4 required=5.0 tests=AWL,BAYES_00,RCVD_IN_DNSWL_LOW,TW_QE X-Spam-Check-By: sourceware.org Received: from service87.mimecast.com (HELO service87.mimecast.com) (94.185.240.25) by sourceware.org (qpsmtpd/0.43rc1) with SMTP; Tue, 19 Apr 2011 12:56:04 +0000 Received: from cam-owa1.Emea.Arm.com (fw-tnat.cambridge.arm.com [217.140.96.21]) by service87.mimecast.com; Tue, 19 Apr 2011 13:56:01 +0100 Received: from [10.1.67.34] ([10.1.255.212]) by cam-owa1.Emea.Arm.com with Microsoft SMTPSVC(6.0.3790.0); Tue, 19 Apr 2011 13:55:57 +0100 Subject: Re: [google] remove redundant push {lr} for -mthumb (issue4441050) From: Richard Earnshaw To: Guozhi Wei Cc: reply@codereview.appspotmail.com, dougkwan@google.com, gcc-patches@gcc.gnu.org In-Reply-To: <20110419094104.B5D2A20562@guozhiwei.sha.corp.google.com> References: <20110419094104.B5D2A20562@guozhiwei.sha.corp.google.com> Date: Tue, 19 Apr 2011 13:26:00 -0000 Message-Id: <1303217757.17819.66.camel@e102346-lin.cambridge.arm.com> Mime-Version: 1.0 X-MC-Unique: 111041913560102001 Content-Type: text/plain; charset=WINDOWS-1252 Content-Transfer-Encoding: quoted-printable X-IsSubscribed: yes Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org X-SW-Source: 2011-04/txt/msg01524.txt.bz2 On Tue, 2011-04-19 at 17:41 +0800, Guozhi Wei wrote: > Reload pass tries to determine the stack frame, so it needs to check the > push/pop lr optimization opportunity. One of the criteria is if there is = any > far jump inside the function. Unfortunately at this time gcc can't decide= each > instruction's length and basic block layout, so it can't know the offset = of > a jump. To be conservative it assumes every jump is a far jump. So any ju= mp > in a function will prevent this push/pop lr optimization. >=20 > To enable the push/pop lr optimization in reload pass, I compute the poss= ible > maximum length of the function body. If the length is not large enough, f= ar > jump is not necessary, so we can safely do push/pop lr optimization. >=20 > Tested on arm qemu with options -march=3Darmv5te -mthumb, without regress= ion. >=20 > This patch is for google/main. >=20 > 2011-04-19 Guozhi Wei >=20 > Google ref 40255. > * gcc/config/arm/arm.c (SHORTEST_FAR_JUMP_LENGTH): New constant. > (estimate_function_length): New function. > (thumb_far_jump_used_p): No far jump is needed in short function. >=20 Setting aside for the moment Richi's issue with hot/cold sections, this isn't safe. Firstly get_attr_length() doesn't return the worst case length; and secondly, it doesn't take into account the size of reload insns that are still on the reloads stack -- these are only emitted right at the end of the reload pass. Both of these would need to be addressed before this can be safely done. It's worth noting here that in the dim and distant past we used to try to estimate the size of the function and eliminate redundant saves of R14, but the code had to be removed because it was too fragile; but it looks like some vestiges of the code are still in the compiler. A slightly less optimistic approach, but one that is much safer is to scan the function after reload has completed and see if we can avoid having to push LR. We can do this if: - The function makes no calls - The function saves nothing on the stack other than r14 - It's small enough (by this point we can use get_attr_length) - R14 is only modified by internal jump instructions There's already some code to try to do this in the ARM back-end (look for lr_save_eliminated), but it's probably not doing its job properly because it tries to cache the result early on (it's costly to work this out) and at the time its first called we cannot assert that the register won't be live. R.