From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 99639 invoked by alias); 19 Dec 2015 16:19:06 -0000
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id:
List-Archive:
List-Post:
List-Help:
Sender: gcc-patches-owner@gcc.gnu.org
Received: (qmail 99619 invoked by uid 89); 19 Dec 2015 16:19:05 -0000
Authentication-Results: sourceware.org; auth=none
X-Virus-Found: No
X-Spam-SWARE-Status: No, score=-2.2 required=5.0 tests=AWL,BAYES_00,FREEMAIL_FROM,RCVD_IN_DNSWL_LOW,SPF_PASS autolearn=ham version=3.3.2 spammy=ubizjak@gmail.com, ubizjakgmailcom, his, our
X-HELO: mail-oi0-f54.google.com
Received: from mail-oi0-f54.google.com (HELO mail-oi0-f54.google.com) (209.85.218.54) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES128-GCM-SHA256 encrypted) ESMTPS; Sat, 19 Dec 2015 16:19:03 +0000
Received: by mail-oi0-f54.google.com with SMTP id y66so74013677oig.0 for ; Sat, 19 Dec 2015 08:19:03 -0800 (PST)
MIME-Version: 1.0
X-Received: by 10.202.75.205 with SMTP id y196mr1107952oia.130.1450541941459; Sat, 19 Dec 2015 08:19:01 -0800 (PST)
Received: by 10.60.138.70 with HTTP; Sat, 19 Dec 2015 08:19:01 -0800 (PST)
In-Reply-To:
References: <20151216232951.GA17976@intel.com>
Date: Sat, 19 Dec 2015 16:19:00 -0000
Message-ID:
Subject: Re: [PATCH] PR target/68937: i686: -fno-plt produces wrong code (maybe only with tailcall
From: Uros Bizjak
To: "H.J. Lu"
Cc: "gcc-patches@gcc.gnu.org", Jakub Jelinek
Content-Type: text/plain; charset=UTF-8
X-SW-Source: 2015-12/txt/msg01945.txt.bz2

On Fri, Dec 18, 2015 at 1:55 AM, H.J. Lu wrote:
> On Thu, Dec 17, 2015 at 1:59 PM, H.J. Lu wrote:
>> On Thu, Dec 17, 2015 at 1:21 PM, Uros Bizjak wrote:
>>> On Thu, Dec 17, 2015 at 7:09 PM, H.J. Lu wrote:
>>>> On Thu, Dec 17, 2015 at 8:11 AM, H.J. Lu wrote:
>>>>> On Thu, Dec 17, 2015 at 7:50 AM, H.J. Lu wrote:
>>>>>> On Thu, Dec 17, 2015 at 5:42 AM, Uros Bizjak wrote:
>>>>>>> On Thu, Dec 17, 2015 at 2:00 PM, H.J. Lu wrote:
>>>>>>>> On Thu, Dec 17, 2015 at 2:04 AM, Uros Bizjak wrote:
>>>>>>>>> On Thu, Dec 17, 2015 at 12:29 AM, H.J. Lu wrote:
>>>>>>>>>> Since sibcall never returns, we can only use call-clobbered register
>>>>>>>>>> as GOT base.  Otherwise, callee-saved register used as GOT base won't
>>>>>>>>>> be properly restored.
>>>>>>>>>>
>>>>>>>>>> Tested on x86-64 with -m32.  OK for trunk?
>>>>>>>>>
>>>>>>>>> You don't have to add explicit clobber for members of "CLOBBERED_REGS"
>>>>>>>>> class, and register_no_elim_operand predicate should be used with "U"
>>>>>>>>> constraint.  Also, please introduce new predicate, similar to how
>>>>>>>>> GOT_memory_operand is defined and handled.
>>>>>>>>>
>>>>>>>>
>>>>>>>> Here is the updated patch.  There is a predicate already,
>>>>>>>> sibcall_memory_operand.  It allows any register to
>>>>>>>> be used as GOT base, which is the root of our problem.
>>>>>>>> This patch removes GOT slot from it and handles
>>>>>>>> sibcall over GOT slot with *sibcall_GOT_32 and
>>>>>>>> *sibcall_value_GOT_32 patterns.  Since I need to
>>>>>>>> expose constraints on GOT base register to RA,
>>>>>>>> I have to use 2 operands, GOT base and function
>>>>>>>> symbol, to describe sibcall over 32-bit GOT slot.
>>>>>>>
>>>>>>> Please use
>>>>>>>
>>>>>>> (mem:SI (plus:SI
>>>>>>>           (match_operand:SI 0 "register_no_elim_operand" "U")
>>>>>>>           (match_operand:SI 1 "GOT32_symbol_operand")))
>>>>>>> ...
>>>>>>>
>>>>>>> to avoid manual rebuild of the operand.
>>>>>>>
>>>>>>
>>>>>> Is this OK?
>>>>>>
>>>>>
>>>>> An updated patch to allow sibcall_memory_operand for RTL
>>>>> expansion.  OK for trunk if there is no regression?
>>>>>
>>>>
>>>> There are no regressions on x86-64 with -m32.  OK for trunk?
>>>
>>> OK for mainline, with the following change:
>>>
>>> @@ -597,11 +597,17 @@
>>>         (match_operand 0 "memory_operand"))))
>>>
>>>  ;; Return true if OP is a memory operands that can be used in sibcalls.
>>> +;; Since sibcall never returns, we can only use call-clobbered register
>>> +;; as GOT base.  Allow GOT slot here only with pseudo register as GOT
>>> +;; base.  Properly handle sibcall over GOT slot with *sibcall_GOT_32
>>> +;; and *sibcall_value_GOT_32 patterns.
>>>  (define_predicate "sibcall_memory_operand"
>>>    (and (match_operand 0 "memory_operand")
>>>         (match_test "CONSTANT_P (XEXP (op, 0))
>>>                      || (GET_CODE (XEXP (op, 0)) == PLUS
>>>                          && REG_P (XEXP (XEXP (op, 0), 0))
>>> +                        && (REGNO (XEXP (XEXP (op, 0), 0))
>>> +                            >= FIRST_PSEUDO_REGISTER)
>>>                          && GET_CODE (XEXP (XEXP (op, 0), 1)) == CONST
>>>                          && GET_CODE (XEXP (XEXP (XEXP (op, 0), 1), 0)) == UNSPEC
>>>                          && XINT (XEXP (XEXP (XEXP (op, 0), 1), 0), 1) == UNSPEC_GOT)")))
>>>
>>> You can use (!HARD_REGISTER_NUM_P (...) || call_used_regs[...]) here.
>>> Call-used hard regs are still allowed here.
>>>
>>> Can you please also rewrite this horrible match_test as a block of C
>>> code using GOT32_symbol_operand predicate?
>>>
>>
>> I am retesting the patch with
>>
>> ;; Return true if OP is a memory operands that can be used in sibcalls.
>> ;; Since sibcall never returns, we can only use call-clobbered register
>> ;; as GOT base.  Allow GOT slot here only with pseudo register as GOT
>> ;; base.  Properly handle sibcall over GOT slot with *sibcall_GOT_32
>> ;; and *sibcall_value_GOT_32 patterns.
>> (define_predicate "sibcall_memory_operand"
>>   (match_operand 0 "memory_operand")
>> {
>>   op = XEXP (op, 0);
>>   if (CONSTANT_P (op))
>>     return true;
>>   if (GET_CODE (op) == PLUS && REG_P (XEXP (op, 0)))
>>     {
>>       int regno = REGNO (XEXP (op, 0));
>>       if (!HARD_REGISTER_NUM_P (regno) || call_used_regs[regno])
>>         {
>>           op = XEXP (op, 1);
>>           if (GOT32_symbol_operand (op, VOIDmode))
>>             return true;
>>         }
>>     }
>>   return false;
>> })
>>
>>
>> I will check it in if there is no regression.
>>
>
> There is no regression.
> But I missed sibcall to local function
> with -O2 -fPIC -m32 -fno-plt -mregparm=3:
>
> extern void bar (int, int, int) __attribute__((visibility("hidden")));
>
> void
> foo (int a, int b, int c)
> {
>   bar (a, b, c);
>   bar (a, b, c);
> }
>
> It doesn't need GOT.  This patch fixes it.
>
> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> index 0e2bec3..691915f9 100644
> --- a/gcc/config/i386/i386.c
> +++ b/gcc/config/i386/i386.c
> @@ -6657,6 +6657,7 @@ ix86_function_ok_for_sibcall (tree decl, tree exp)
>  {
>    tree type, decl_or_type;
>    rtx a, b;
> +  bool bind_global = decl && !targetm.binds_local_p (decl);
>
>    /* If we are generating position-independent code, we cannot sibcall
>       optimize direct calls to global functions, as the PLT requires
> @@ -6665,7 +6666,7 @@ ix86_function_ok_for_sibcall (tree decl, tree exp)
>        && !TARGET_64BIT
>        && flag_pic
>        && flag_plt
> -      && decl && !targetm.binds_local_p (decl))
> +      && bind_global)
>      return false;
>
>    /* If we need to align the outgoing stack, then sibcalling would
> @@ -6726,7 +6727,7 @@ ix86_function_ok_for_sibcall (tree decl, tree exp)
>       parameters.  Note that DLLIMPORT functions and call via GOT
>       slot are indirect.  */
>    if (!decl
> -      || (flag_pic && !flag_plt)
> +      || (bind_global && flag_pic && !flag_plt)
>        || (TARGET_DLLIMPORT_DECL_ATTRIBUTES && DECL_DLLIMPORT_P (decl)))
>      {
>        /* Check if regparm >= 3 since arg_reg_available is set to
>
> Here is the complete patch with a testcase.  OK for trunk?

LGTM, but please allow Jakub (CC'd) a couple of days to raise any
objections.  Otherwise, OK for mainline after this period.

Uros.
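
For readers following the thread, the two conditions discussed above can be sketched in standalone C. This is only an illustrative model, not GCC code: every `toy_`/`TOY_` name, the eight-register file, and the choice of which registers count as call-used are assumptions made for the example. It mirrors the register test suggested in review (`!HARD_REGISTER_NUM_P (regno) || call_used_regs[regno]`) and the `bind_global && flag_pic && !flag_plt` test from the i386.c hunk.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical register file for illustration only; these numbers are not
   taken from GCC.  Anything >= TOY_FIRST_PSEUDO_REGISTER models a pseudo.  */
enum
{
  TOY_AX, TOY_DX, TOY_CX, TOY_BX, TOY_SI, TOY_DI, TOY_BP, TOY_SP,
  TOY_FIRST_PSEUDO_REGISTER
};

/* Assumption: only AX, DX and CX are call-clobbered in this toy model.  */
static const bool toy_call_used_regs[TOY_FIRST_PSEUDO_REGISTER] = {
  [TOY_AX] = true, [TOY_DX] = true, [TOY_CX] = true
};

/* Models the register test from the review: a GOT base is acceptable for a
   sibcall if it is a pseudo, or a hard register the callee may clobber.
   Assumes regno >= 0.  */
bool
toy_got_base_ok_for_sibcall (int regno)
{
  bool hard_reg = regno < TOY_FIRST_PSEUDO_REGISTER;
  return !hard_reg || toy_call_used_regs[regno];
}

/* Models the condition from the i386.c hunk: the call goes through a GOT
   slot only for a known, non-local callee under PIC without PLT.  */
bool
toy_call_is_got_indirect (bool have_decl, bool binds_local,
                          bool flag_pic, bool flag_plt)
{
  bool bind_global = have_decl && !binds_local;
  return bind_global && flag_pic && !flag_plt;
}

/* Small self-check covering the cases mentioned in the thread.  */
void
toy_self_check (void)
{
  assert (toy_got_base_ok_for_sibcall (TOY_CX));    /* call-used hard reg */
  assert (!toy_got_base_ok_for_sibcall (TOY_BX));   /* callee-saved hard reg */
  assert (toy_got_base_ok_for_sibcall (TOY_FIRST_PSEUDO_REGISTER)); /* pseudo */
  /* Hidden-visibility callee (binds locally): no GOT slot needed.  */
  assert (!toy_call_is_got_indirect (true, true, true, false));
}
```

In this model, the hidden-visibility `bar` from the testcase corresponds to `binds_local == true`, so the call is not treated as an indirect call via a GOT slot, which is the case the last hunk is meant to fix.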