On Thu, Dec 17, 2015 at 1:59 PM, H.J. Lu wrote: > On Thu, Dec 17, 2015 at 1:21 PM, Uros Bizjak wrote: >> On Thu, Dec 17, 2015 at 7:09 PM, H.J. Lu wrote: >>> On Thu, Dec 17, 2015 at 8:11 AM, H.J. Lu wrote: >>>> On Thu, Dec 17, 2015 at 7:50 AM, H.J. Lu wrote: >>>>> On Thu, Dec 17, 2015 at 5:42 AM, Uros Bizjak wrote: >>>>>> On Thu, Dec 17, 2015 at 2:00 PM, H.J. Lu wrote: >>>>>>> On Thu, Dec 17, 2015 at 2:04 AM, Uros Bizjak wrote: >>>>>>>> On Thu, Dec 17, 2015 at 12:29 AM, H.J. Lu wrote: >>>>>>>>> Since sibcall never returns, we can only use call-clobbered register >>>>>>>>> as GOT base. Otherwise, callee-saved register used as GOT base won't >>>>>>>>> be properly restored. >>>>>>>>> >>>>>>>>> Tested on x86-64 with -m32. OK for trunk? >>>>>>>> >>>>>>>> You don't have to add explicit clobber for members of "CLOBBERED_REGS" >>>>>>>> class, and register_no_elim_operand predicate should be used with "U" >>>>>>>> constraint. Also, please introduce new predicate, similar to how >>>>>>>> GOT_memory_operand is defined and handled. >>>>>>>> >>>>>>> >>>>>>> Here is the updated patch. There is a predicate already, >>>>>>> sibcall_memory_operand. It allows any registers to >>>>>>> be as GOT base, which is the root of our problem. >>>>>>> This patch removes GOT slot from it and handles >>>>>>> sibcall over GOT slot with *sibcall_GOT_32 and >>>>>>> *sibcall_value_GOT_32 patterns. Since I need to >>>>>>> expose constraints on GOT base register to RA, >>>>>>> I have to use 2 operands, GOT base and function >>>>>>> symbol, to describe sibcall over 32-bit GOT slot. >>>>>> >>>>>> Please use >>>>>> >>>>>> (mem:SI (plus:SI >>>>>> (match_operand:SI 0 "register_no_elim_operand" "U") >>>>>> (match_operand:SI 1 "GOT32_symbol_operand"))) >>>>>> ... >>>>>> >>>>>> to avoid manual rebuild of the operand. >>>>>> >>>>> >>>>> Is this OK? >>>>> >>>> >>>> An updated patch to allow sibcall_memory_operand for RTL >>>> expansion. OK for trunk if there is no regression? >>>> >>> >>> There is no regressions on x86-64 with -m32. OK for trunk? >> >> OK for mainline, with a following change: >> >> @@ -597,11 +597,17 @@ >> (match_operand 0 "memory_operand")))) >> >> ;; Return true if OP is a memory operands that can be used in sibcalls. >> +;; Since sibcall never returns, we can only use call-clobbered register >> +;; as GOT base. Allow GOT slot here only with pseudo register as GOT >> +;; base. Properly handle sibcall over GOT slot with *sibcall_GOT_32 >> +;; and *sibcall_value_GOT_32 patterns. >> (define_predicate "sibcall_memory_operand" >> (and (match_operand 0 "memory_operand") >> (match_test "CONSTANT_P (XEXP (op, 0)) >> || (GET_CODE (XEXP (op, 0)) == PLUS >> && REG_P (XEXP (XEXP (op, 0), 0)) >> + && (REGNO (XEXP (XEXP (op, 0), 0)) >> + >= FIRST_PSEUDO_REGISTER) >> && GET_CODE (XEXP (XEXP (op, 0), 1)) == CONST >> && GET_CODE (XEXP (XEXP (XEXP (op, 0), 1), 0)) == UNSPEC >> && XINT (XEXP (XEXP (XEXP (op, 0), 1), 0), 1) == UNSPEC_GOT)"))) >> >> You can use (!HARD_REGISTER_NUM_P (...) || call_used_regs[...]) here. >> Call-used hard regs are still allowed here. >> >> Can you please also rewrite this horrible match_test as a block of C >> code using GOT32_symbol_operand predicate? >> > > I am retesting the patch with > > ;; Return true if OP is a memory operands that can be used in sibcalls. > ;; Since sibcall never returns, we can only use call-clobbered register > ;; as GOT base. Allow GOT slot here only with pseudo register as GOT > ;; base. Properly handle sibcall over GOT slot with *sibcall_GOT_32 > ;; and *sibcall_value_GOT_32 patterns. > (define_predicate "sibcall_memory_operand" > (match_operand 0 "memory_operand") > { > op = XEXP (op, 0); > if (CONSTANT_P (op)) > return true; > if (GET_CODE (op) == PLUS && REG_P (XEXP (op, 0))) > { > int regno = REGNO (XEXP (op, 0)); > if (!HARD_REGISTER_NUM_P (regno) || call_used_regs[regno]) > { > op = XEXP (op, 1); > if (GOT32_symbol_operand (op, VOIDmode)) > return true; > } > } > return false; > }) > > > I will check it in if there is no regression. > There is no regression. But I missed sibcall to local function with -O2 -fPIC -m32 -fno-plt -mregparm=3: extern void bar (int, int, int) __attribute__((visibility("hidden"))); void foo (int a, int b, int c) { bar (a, b, c); bar (a, b, c); } It doesn't need GOT. This patch fixes it. iff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c index 0e2bec3..691915f9 100644 --- a/gcc/config/i386/i386.c +++ b/gcc/config/i386/i386.c @@ -6657,6 +6657,7 @@ ix86_function_ok_for_sibcall (tree decl, tree exp) { tree type, decl_or_type; rtx a, b; + bool bind_global = decl && !targetm.binds_local_p (decl); /* If we are generating position-independent code, we cannot sibcall optimize direct calls to global functions, as the PLT requires @@ -6665,7 +6666,7 @@ ix86_function_ok_for_sibcall (tree decl, tree exp) && !TARGET_64BIT && flag_pic && flag_plt - && decl && !targetm.binds_local_p (decl)) + && bind_global) return false; /* If we need to align the outgoing stack, then sibcalling would @@ -6726,7 +6727,7 @@ ix86_function_ok_for_sibcall (tree decl, tree exp) parameters. Note that DLLIMPORT functions and call via GOT slot are indirect. */ if (!decl - || (flag_pic && !flag_plt) + || (bind_global && flag_pic && !flag_plt) || (TARGET_DLLIMPORT_DECL_ATTRIBUTES && DECL_DLLIMPORT_P (decl))) { /* Check if regparm >= 3 since arg_reg_available is set to Here is the complete patch with a testcase. OK for trunk? Thanks. -- H.J.