Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Subject: Re: [ARM/FDPIC v5 04/21] [ARM] FDPIC: Add support for FDPIC for arm architecture
References: <20190515124006.25840-1-christophe.lyon@st.com> <20190515124006.25840-5-christophe.lyon@st.com>
From: Christophe Lyon
Date: Tue, 20 Aug 2019 17:13:00 -0000

On 16/07/2019 13:58, Richard Sandiford wrote:
> Christophe Lyon writes:
>> The FDPIC register is hard-coded to r9, as defined in the ABI.
>>
>> We have to disable tailcall optimizations if we don't know if the
>> target function is in the same module. If not, we have to set r9 to
>> the value associated with the target module.
>>
>> When generating a symbol address, we have to take into account whether
>> it is a pointer to data or to a function, because different
>> relocations are needed.
>>
>> 2019-XX-XX Christophe Lyon
>> Mickaël Guêné
>>
>> * config/arm/arm-c.c (__FDPIC__): Define new pre-processor macro
>> in FDPIC mode.
>> * config/arm/arm-protos.h (arm_load_function_descriptor): Declare
>> new function.
>> * config/arm/arm.c (arm_option_override): Define pic register to
>> FDPIC_REGNUM.
>> (arm_function_ok_for_sibcall): Disable sibcall optimization if we
>> have no decl or go through PLT.
>> (arm_load_pic_register): Handle TARGET_FDPIC.
>> (arm_is_segment_info_known): New function.
>> (arm_pic_static_addr): Add support for FDPIC.
>> (arm_load_function_descriptor): New function.
>> (arm_assemble_integer): Add support for FDPIC.
>> * config/arm/arm.h (PIC_OFFSET_TABLE_REG_CALL_CLOBBERED):
>> Define. (FDPIC_REGNUM): New define.
>> * config/arm/arm.md (call): Add support for FDPIC.
>> (call_value): Likewise.
>> (*restore_pic_register_after_call): New pattern.
>> (untyped_call): Disable if FDPIC.
>> (untyped_return): Likewise.
>> * config/arm/unspecs.md (UNSPEC_PIC_RESTORE): New.
>>
>> Change-Id: I8fb1a6b85ace672184013568c5d28fbda2f7fda4
>>
>> diff --git a/gcc/config/arm/arm-c.c b/gcc/config/arm/arm-c.c
>> index 6e256ee..34695fa 100644
>> --- a/gcc/config/arm/arm-c.c
>> +++ b/gcc/config/arm/arm-c.c
>> @@ -203,6 +203,8 @@ arm_cpu_builtins (struct cpp_reader* pfile)
>>        builtin_define ("__ARM_EABI__");
>>      }
>>
>> +  def_or_undef_macro (pfile, "__FDPIC__", TARGET_FDPIC);
>> +
>>    def_or_undef_macro (pfile, "__ARM_ARCH_EXT_IDIV__", TARGET_IDIV);
>>    def_or_undef_macro (pfile, "__ARM_FEATURE_IDIV", TARGET_IDIV);
>>
>> diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
>> index 485bc68..272968a 100644
>> --- a/gcc/config/arm/arm-protos.h
>> +++ b/gcc/config/arm/arm-protos.h
>> @@ -139,6 +139,7 @@ extern int arm_max_const_double_inline_cost (void);
>>  extern int arm_const_double_inline_cost (rtx);
>>  extern bool arm_const_double_by_parts (rtx);
>>  extern bool arm_const_double_by_immediates (rtx);
>> +extern rtx arm_load_function_descriptor (rtx funcdesc);
>>  extern void arm_emit_call_insn (rtx, rtx, bool);
>>  bool detect_cmse_nonsecure_call (tree);
>>  extern const char *output_call (rtx *);
>> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
>> index 45abcd8..d9397b5 100644
>> --- a/gcc/config/arm/arm.c
>> +++ b/gcc/config/arm/arm.c
>> @@ -3485,6 +3485,15 @@ arm_option_override (void)
>>    if (flag_pic && TARGET_VXWORKS_RTP)
>>      arm_pic_register = 9;
>>
>> +  /* If in FDPIC mode then force arm_pic_register to be r9. */
>> +  if (TARGET_FDPIC)
>> +    {
>> +      arm_pic_register = FDPIC_REGNUM;
>> +      if (! TARGET_ARM && ! TARGET_THUMB2)
>> +        sorry ("FDPIC mode is supported on architecture versions that "
>> +               "support ARM or Thumb-2 only.");
>> +    }
>> +
>>    if (arm_pic_register_string != NULL)
>>      {
>>        int pic_register = decode_reg_name (arm_pic_register_string);
>
> Isn't this equivalent to rejecting Thumb-1? I think that would be
> clearer in both the condition and the error message.
>
> How does this interact with arm_pic_data_is_text_relative? Are both
> values supported?
>
>> @@ -7295,6 +7304,21 @@ arm_function_ok_for_sibcall (tree decl, tree exp)
>>    if (cfun->machine->sibcall_blocked)
>>      return false;
>>
>> +  if (TARGET_FDPIC)
>> +    {
>> +      /* In FDPIC, never tailcall something for which we have no decl:
>> +         the target function could be in a different module, requiring
>> +         a different FDPIC register value. */
>> +      if (decl == NULL)
>> +        return false;
>> +
>> +      /* Don't tailcall if we go through the PLT since the FDPIC
>> +         register is then corrupted and we don't restore it after
>> +         static function calls. */
>> +      if (!targetm.binds_local_p (decl))
>> +        return false;
>> +    }
>> +
>>    /* Never tailcall something if we are generating code for Thumb-1. */
>>    if (TARGET_THUMB1)
>>      return false;
>> @@ -7711,7 +7735,9 @@ arm_load_pic_register (unsigned long saved_regs ATTRIBUTE_UNUSED, rtx pic_reg)
>>  {
>>    rtx l1, labelno, pic_tmp, pic_rtx;
>>
>> -  if (crtl->uses_pic_offset_table == 0 || TARGET_SINGLE_PIC_BASE)
>> +  if (crtl->uses_pic_offset_table == 0
>> +      || TARGET_SINGLE_PIC_BASE
>> +      || TARGET_FDPIC)
>>      return;
>>
>>    gcc_assert (flag_pic);
>> @@ -7780,28 +7806,142 @@ arm_load_pic_register (unsigned long saved_regs ATTRIBUTE_UNUSED, rtx pic_reg)
>>    emit_use (pic_reg);
>>  }
>>
>> +/* Try to determine whether an object, referenced via ORIG, will be
>> +   placed in the text or data segment. This is used in FDPIC mode, to
>> +   decide which relocations to use when accessing ORIG. IS_READONLY
>> +   is set to true if ORIG is a read-only location, false otherwise.
>> +   Return true if we could determine the location of ORIG, false
>> +   otherwise. IS_READONLY is valid only when we return true. */
>
> Maybe *IS_READONLY in both cases?
>
>> +static bool
>> +arm_is_segment_info_known (rtx orig, bool *is_readonly)
>> +{
>> +  bool res = false;
>> +
>> +  *is_readonly = false;
>> +
>> +  if (GET_CODE (orig) == LABEL_REF)
>> +    {
>> +      res = true;
>> +      *is_readonly = true;
>> +    }
>
> Think this function would be easier to read with early returns.
>
>> +  else if (SYMBOL_REF_P (orig))
>
> ...so "if" rather than "else if" here.
>
>> +    {
>> +      if (CONSTANT_POOL_ADDRESS_P (orig))
>> +        {
>> +          res = true;
>> +          *is_readonly = true;
>> +        }
>> +      else if (SYMBOL_REF_LOCAL_P (orig)
>> +               && !SYMBOL_REF_EXTERNAL_P (orig)
>> +               && SYMBOL_REF_DECL (orig)
>> +               && (!DECL_P (SYMBOL_REF_DECL (orig))
>> +                   || !DECL_COMMON (SYMBOL_REF_DECL (orig))))
>> +        {
>> +          tree decl = SYMBOL_REF_DECL (orig);
>> +          tree init = (TREE_CODE (decl) == VAR_DECL)
>> +            ? DECL_INITIAL (decl) : (TREE_CODE (decl) == CONSTRUCTOR)
>> +            ? decl : 0;
>> +          int reloc = 0;
>> +          bool named_section, readonly;
>> +
>> +          if (init && init != error_mark_node)
>> +            reloc = compute_reloc_for_constant (init);
>> +
>> +          named_section = TREE_CODE (decl) == VAR_DECL
>> +            && lookup_attribute ("section", DECL_ATTRIBUTES (decl));
>
> Here too I think it would be better to return false early.
>
> How much variation do you support here for named sections? E.g. can a
> linker script really put SECTION_WRITE sections in the text segment?
> Seems like there are some cases that could be handled.
>
> (Just asking, not suggesting you should change anything.)
>
>> +          readonly = decl_readonly_section (decl, reloc);
>> +
>> +          /* We don't know where the link script will put a named
>> +             section, so return false in such a case. */
>> +          res = !named_section;
>> +
>> +          if (!named_section)
>> +            *is_readonly = readonly;
>> +        }
>> +      else
>> +        {
>> +          /* We don't know. */
>> +          res = false;
>> +        }
>> +    }
>> +  else
>> +    gcc_unreachable ();
>> +
>> +  return res;
>> +}
>> +
>>  /* Generate code to load the address of a static var when flag_pic is set. */
>>  static rtx_insn *
>>  arm_pic_static_addr (rtx orig, rtx reg)
>>  {
>>    rtx l1, labelno, offset_rtx;
>> +  rtx_insn *insn;
>>
>>    gcc_assert (flag_pic);
>>
>> -  /* We use an UNSPEC rather than a LABEL_REF because this label
>> -     never appears in the code stream. */
>> -  labelno = GEN_INT (pic_labelno++);
>> -  l1 = gen_rtx_UNSPEC (Pmode, gen_rtvec (1, labelno), UNSPEC_PIC_LABEL);
>> -  l1 = gen_rtx_CONST (VOIDmode, l1);
>> +  bool is_readonly = false;
>> +  bool info_known = false;
>>
>> -  /* On the ARM the PC register contains 'dot + 8' at the time of the
>> -     addition, on the Thumb it is 'dot + 4'. */
>> -  offset_rtx = plus_constant (Pmode, l1, TARGET_ARM ? 8 : 4);
>> -  offset_rtx = gen_rtx_UNSPEC (Pmode, gen_rtvec (2, orig, offset_rtx),
>> -                               UNSPEC_SYMBOL_OFFSET);
>> -  offset_rtx = gen_rtx_CONST (Pmode, offset_rtx);
>> +  if (TARGET_FDPIC
>> +      && SYMBOL_REF_P (orig)
>> +      && !SYMBOL_REF_FUNCTION_P (orig))
>> +      info_known = arm_is_segment_info_known (orig, &is_readonly);
>
> Excess indentation. Feels like it might be slightly simpler
> to handle SYMBOL_REF_FUNCTION_P in arm_is_segment_info_known,
> but I guess the idea is that it might not then be clear whether
> the caller is asking about a descriptor or the function itself.
>
>>
>> -  return emit_insn (gen_pic_load_addr_unified (reg, offset_rtx, labelno));
>> +  if (TARGET_FDPIC
>> +      && SYMBOL_REF_P (orig)
>> +      && !SYMBOL_REF_FUNCTION_P (orig)
>> +      && !info_known)
>> +    {
>> +      /* We don't know where orig is stored, so we have to be
>> +         pessimistic and use a GOT relocation. */
>> +      rtx pat;
>> +      rtx mem;
>> +      rtx pic_reg = gen_rtx_REG (Pmode, FDPIC_REGNUM);
>> +
>> +      pat = gen_calculate_pic_address (reg, pic_reg, orig);
>> +
>> +      /* Make the MEM as close to a constant as possible. */
>> +      mem = SET_SRC (pat);
>> +      gcc_assert (MEM_P (mem) && !MEM_VOLATILE_P (mem));
>> +      MEM_READONLY_P (mem) = 1;
>> +      MEM_NOTRAP_P (mem) = 1;
>> +
>> +      insn = emit_insn (pat);
>
> Think "pat = ..." onwards should be split out into a helper, since it's
> a cut-&-paste of the code in legitimize_pic_address.
>
>> +    }
>> +  else if (TARGET_FDPIC
>> +           && SYMBOL_REF_P (orig)
>> +           && (SYMBOL_REF_FUNCTION_P (orig)
>> +               || (info_known && !is_readonly)))
>> +    {
>> +      /* We use the GOTOFF relocation. */
>> +      rtx pic_reg = gen_rtx_REG (Pmode, FDPIC_REGNUM);
>> +
>> +      rtx l1 = gen_rtx_UNSPEC (Pmode, gen_rtvec (1, orig), UNSPEC_PIC_SYM);
>> +      emit_insn (gen_movsi (reg, l1));
>> +      insn = emit_insn (gen_addsi3 (reg, reg, pic_reg));
>> +    }
>> +  else
>> +    {
>> +      /* Not FDPIC, not SYMBOL_REF_P or readonly: we can use
>> +         PC-relative access. */
>> +      /* We use an UNSPEC rather than a LABEL_REF because this label
>> +         never appears in the code stream. */
>> +      labelno = GEN_INT (pic_labelno++);
>> +      l1 = gen_rtx_UNSPEC (Pmode, gen_rtvec (1, labelno), UNSPEC_PIC_LABEL);
>> +      l1 = gen_rtx_CONST (VOIDmode, l1);
>> +
>> +      /* On the ARM the PC register contains 'dot + 8' at the time of the
>> +         addition, on the Thumb it is 'dot + 4'. */
>> +      offset_rtx = plus_constant (Pmode, l1, TARGET_ARM ? 8 : 4);
>> +      offset_rtx = gen_rtx_UNSPEC (Pmode, gen_rtvec (2, orig, offset_rtx),
>> +                                   UNSPEC_SYMBOL_OFFSET);
>> +      offset_rtx = gen_rtx_CONST (Pmode, offset_rtx);
>> +
>> +      insn = emit_insn (gen_pic_load_addr_unified (reg, offset_rtx,
>> +                                                   labelno));
>> +    }
>> +
>> +  return insn;
>>  }
>>
>>  /* Return nonzero if X is valid as an ARM state addressing register. */
>> @@ -16112,9 +16252,36 @@ get_jump_table_size (rtx_jump_table_data *insn)
>>    return 0;
>>  }
>>
>> +/* Emit insns to load the function address from FUNCDESC (an FDPIC
>> +   function descriptor) into a register and the GOT address into the
>> +   FDPIC register, returning an rtx for the register holding the
>> +   function address. */
>> +
>> +rtx
>> +arm_load_function_descriptor (rtx funcdesc)
>> +{
>> +  rtx fnaddr_reg = gen_reg_rtx (Pmode);
>> +  rtx pic_reg = gen_rtx_REG (Pmode, FDPIC_REGNUM);
>> +  rtx fnaddr = gen_rtx_MEM (Pmode, funcdesc);
>> +  rtx gotaddr = gen_rtx_MEM (Pmode, plus_constant (Pmode, funcdesc, 4));
>> +  rtx par = gen_rtx_PARALLEL (VOIDmode, rtvec_alloc (3));
>> +
>> +  emit_move_insn (fnaddr_reg, fnaddr);
>> +  /* The ABI requires the entry point address to be loaded first, so
>> +     prevent the load from being moved after that of the GOT
>> +     address. */
>
> Do you mean that the move insn above has to come before the
> pattern below? If so, I think that should be enforced by making this...
>
>> +  XVECEXP (par, 0, 0) = gen_rtx_UNSPEC (VOIDmode,
>> +                                        gen_rtvec (2, pic_reg, gotaddr),
>> +                                        UNSPEC_PIC_RESTORE);
>> +  XVECEXP (par, 0, 1) = gen_rtx_USE (VOIDmode, gotaddr);
>> +  XVECEXP (par, 0, 2) = gen_rtx_CLOBBER (VOIDmode, pic_reg);
>> +  emit_insn (par);
>> +
>> +  return fnaddr_reg;
>> +}
>> +
>
> ...use fnaddr_reg.
>
> Does the instruction actually use pic_reg? We only get here for
> non-symbolic addresses after all.
>
> It seems simpler to make *restore_pic_register_after_call a named pattern
> and use gen_restore_pic_register_after_call instead.
>
>>  /* Return the maximum amount of padding that will be inserted before
>>     label LABEL. */
>> -
>>  static HOST_WIDE_INT
>>  get_label_padding (rtx label)
>>  {
>> @@ -23069,9 +23236,37 @@ arm_assemble_integer (rtx x, unsigned int size, int aligned_p)
>>                    && (!SYMBOL_REF_LOCAL_P (x)
>>                        || (SYMBOL_REF_DECL (x)
>>                            ? DECL_WEAK (SYMBOL_REF_DECL (x)) : 0))))
>> -            fputs ("(GOT)", asm_out_file);
>> +            {
>> +              if (TARGET_FDPIC && SYMBOL_REF_FUNCTION_P (x))
>> +                fputs ("(GOTFUNCDESC)", asm_out_file);
>> +              else
>> +                fputs ("(GOT)", asm_out_file);
>> +            }
>>            else
>> -            fputs ("(GOTOFF)", asm_out_file);
>> +            {
>> +              if (TARGET_FDPIC && SYMBOL_REF_FUNCTION_P (x))
>> +                fputs ("(GOTOFFFUNCDESC)", asm_out_file);
>> +              else
>> +                {
>> +                  bool is_readonly;
>> +
>> +                  if (arm_is_segment_info_known (x, &is_readonly))
>> +                    fputs ("(GOTOFF)", asm_out_file);
>> +                  else
>> +                    fputs ("(GOT)", asm_out_file);
>> +                }
>> +            }
>> +        }
>> +
>> +      /* For FDPIC we also have to mark symbol for .data section. */
>> +      if (TARGET_FDPIC
>> +          && NEED_GOT_RELOC
>> +          && flag_pic
>> +          && !making_const_table
>> +          && SYMBOL_REF_P (x))
>> +        {
>> +          if (SYMBOL_REF_FUNCTION_P (x))
>> +            fputs ("(FUNCDESC)", asm_out_file);
>>          }
>>        fputc ('\n', asm_out_file);
>>        return true;
>
> Do you expect to reach here for LABEL_REFs with TARGET_FDPIC? The second
> block of code tests for SYMBOL_REF_P but the first tests
> SYMBOL_REF_FUNCTION_P without checking SYMBOL_REF_P first.
>
> Can NEED_GOT_RELOC or flag_pic be false for TARGET_FDPIC?
> Is !flag_pic TARGET_FDPIC supported?
>
>> diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
>> index 0aecd03..9036255 100644
>> --- a/gcc/config/arm/arm.md
>> +++ b/gcc/config/arm/arm.md
>> @@ -8127,6 +8127,23 @@
>>      rtx callee, pat;
>>      tree addr = MEM_EXPR (operands[0]);
>>
>> +    /* Force FDPIC register (r9) before call. */
>> +    if (TARGET_FDPIC)
>> +      {
>> +        /* No need to update r9 if calling a static function.
>> +           In other words: set r9 for indirect or non-local calls. */
>> +        callee = XEXP (operands[0], 0);
>> +        if (!SYMBOL_REF_P (callee)
>> +            || !SYMBOL_REF_LOCAL_P (callee)
>> +            || arm_is_long_call_p (SYMBOL_REF_DECL (callee)))
>
> IMO it would be better to calculate this once rather than repeat
> it below.
>
>> +          {
>> +            emit_insn (gen_blockage ());
>
> Why's the blockage needed? Seems worth a comment.
>
>> +            rtx pic_reg = gen_rtx_REG (Pmode, FDPIC_REGNUM);
>> +            emit_move_insn (pic_reg, get_hard_reg_initial_val (Pmode, FDPIC_REGNUM));
>> +            emit_insn (gen_rtx_USE (VOIDmode, pic_reg));
>
> Is this use keeping the register live for the call? If so,
> I think it'd be better to attach it to the CALL_INSN_FUNCTION_USAGE
> instead.
>
>> +          }
>> +      }
>> +
>>      /* In an untyped call, we can get NULL for operand 2. */
>>      if (operands[2] == NULL_RTX)
>>        operands[2] = const0_rtx;
>> @@ -8140,6 +8157,13 @@
>>           : !REG_P (callee))
>>        XEXP (operands[0], 0) = force_reg (Pmode, callee);
>>
>> +    if (TARGET_FDPIC && !SYMBOL_REF_P (XEXP (operands[0], 0)))
>> +      {
>> +        /* Indirect call: set r9 with FDPIC value of callee. */
>> +        XEXP (operands[0], 0)
>> +          = arm_load_function_descriptor (XEXP (operands[0], 0));
>> +      }
>> +
>>      if (detect_cmse_nonsecure_call (addr))
>>        {
>>          pat = gen_nonsecure_call_internal (operands[0], operands[1],
>
> Redundant braces.
>
>> @@ -8151,10 +8175,38 @@
>>          pat = gen_call_internal (operands[0], operands[1], operands[2]);
>>          arm_emit_call_insn (pat, XEXP (operands[0], 0), false);
>>        }
>> +
>> +    /* Restore FDPIC register (r9) after call. */
>> +    if (TARGET_FDPIC)
>> +      {
>> +        /* No need to update r9 if calling a static function. */
>> +        if (!SYMBOL_REF_P (callee)
>> +            || !SYMBOL_REF_LOCAL_P (callee)
>> +            || arm_is_long_call_p (SYMBOL_REF_DECL (callee)))
>> +          {
>> +            rtx pic_reg = gen_rtx_REG (Pmode, FDPIC_REGNUM);
>> +            emit_move_insn (pic_reg, get_hard_reg_initial_val (Pmode, FDPIC_REGNUM));
>> +            emit_insn (gen_rtx_USE (VOIDmode, pic_reg));
>> +            emit_insn (gen_blockage ());
>> +          }
>> +      }
>>      DONE;
>>    }"
>>  )
>
> What's the general assumption about the validity of r9? Seems odd that
> we need to load this value both before and after the call.
>
>>
>> +(define_insn "*restore_pic_register_after_call"
>> +  [(parallel [(unspec [(match_operand:SI 0 "s_register_operand" "=r,r")
>> +                       (match_operand:SI 1 "nonimmediate_operand" "r,m")]
>> +              UNSPEC_PIC_RESTORE)
>> +             (use (match_dup 1))
>> +             (clobber (match_dup 0))])
>> +  ]
>> +  ""
>> +  "@
>> +  mov\t%0, %1
>> +  ldr\t%0, %1"
>> +)
>> +
>>  (define_expand "call_internal"
>>    [(parallel [(call (match_operand 0 "memory_operand" "")
>>                      (match_operand 1 "general_operand" ""))
>
> Since operand 0 is significant after the instruction, I think this
> should be:
>
> (define_insn "*restore_pic_register_after_call"
>   [(set (match_operand:SI 0 "s_register_operand" "+r,r")
>         (unspec:SI [(match_dup 0)
>                     (match_operand:SI 1 "nonimmediate_operand" "r,m")]
>                    UNSPEC_PIC_RESTORE))]
>   ...
>
> The (use (match_dup 1)) looks redundant, since the unspec itself
> uses operand 1.
>

When I try that, there are cases where the restore instruction is discarded: this happens when the call comes just before the function returns. Since r9 is caller-saved, it should be restored, but after dse2 the dump says:

(insn (set (reg:SI 9 r9)
        (unspec:SI [
                (reg:SI 9 r9)
                (reg:SI 4 r4 [121])
            ] UNSPEC_PIC_RESTORE))
     (expr_list:REG_UNUSED (reg:SI 9 r9)
        (nil))))

and this insn is later removed by cprop_hardreg (which says the exit block uses r4, sp, and lr: should I make it use r9?)

Thanks,

Christophe

>> @@ -8215,6 +8267,30 @@
>>      rtx pat, callee;
>>      tree addr = MEM_EXPR (operands[1]);
>>
>> +    /* Force FDPIC register (r9) before call. */
>> +    if (TARGET_FDPIC)
>> +      {
>> +        /* No need to update the FDPIC register (r9) if calling a static function.
>> +           In other words: set r9 for indirect or non-local calls. */
>> +        callee = XEXP (operands[1], 0);
>> +        if (!SYMBOL_REF_P (callee)
>> +            || !SYMBOL_REF_LOCAL_P (callee)
>> +            || arm_is_long_call_p (SYMBOL_REF_DECL (callee)))
>> +          {
>> +            rtx par = gen_rtx_PARALLEL (VOIDmode, rtvec_alloc (3));
>> +            rtx fdpic_reg = gen_rtx_REG (Pmode, FDPIC_REGNUM);
>> +            rtx initial_fdpic_reg =
>> +              get_hard_reg_initial_val (Pmode, FDPIC_REGNUM);
>> +
>> +            XVECEXP (par, 0, 0) = gen_rtx_UNSPEC (VOIDmode,
>> +                                                  gen_rtvec (2, fdpic_reg, initial_fdpic_reg),
>> +                                                  UNSPEC_PIC_RESTORE);
>> +            XVECEXP (par, 0, 1) = gen_rtx_USE (VOIDmode, initial_fdpic_reg);
>> +            XVECEXP (par, 0, 2) = gen_rtx_CLOBBER (VOIDmode, fdpic_reg);
>> +            emit_insn (par);
>> +          }
>> +      }
>> +
>
> It's not obvious why this code is different from the call-without-value
> case above, which doesn't use UNSPEC_PIC_RESTORE. I think it should be
> split out into a helper function that's used for both call and call_value.
>
> I think it would also be good to have more comments about what
> conditions the UNSPEC_PIC_RESTORE pattern is enforcing.
>
> Thanks,
> Richard
> .
>
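
For reference, the three-way relocation choice made in the quoted arm_pic_static_addr hunk can be modelled outside GCC roughly as follows. This is only a sketch under assumptions: reloc_kind, sym_info and choose_static_addr_reloc are hypothetical names that do not exist in GCC, and the booleans merely stand in for TARGET_FDPIC, SYMBOL_REF_P, SYMBOL_REF_FUNCTION_P and the result of arm_is_segment_info_known. It is written with early returns, in the spirit of the review comments, and is not the code from the patch.

```c
#include <stdbool.h>

/* Hypothetical model of the relocation kinds discussed in the thread.
   None of these names exist in GCC.  */
enum reloc_kind
{
  RELOC_GOT,     /* Load the address through a GOT entry.  */
  RELOC_GOTOFF,  /* Offset from the FDPIC register (r9).  */
  RELOC_PCREL    /* PC-relative access.  */
};

/* Hypothetical summary of what the compiler knows about a reference.  */
struct sym_info
{
  bool is_symbol_ref;  /* Stand-in for SYMBOL_REF_P.  */
  bool is_function;    /* Stand-in for SYMBOL_REF_FUNCTION_P.  */
  bool segment_known;  /* Stand-in for arm_is_segment_info_known's result.  */
  bool is_readonly;    /* Only meaningful when segment_known is true.  */
};

/* Mirror the if / else if / else chain of the quoted hunk.  */
static enum reloc_kind
choose_static_addr_reloc (bool fdpic, const struct sym_info *s)
{
  /* Outside FDPIC, or for non-symbols such as labels: PC-relative.  */
  if (!fdpic || !s->is_symbol_ref)
    return RELOC_PCREL;

  /* Functions are always addressed relative to r9.  */
  if (s->is_function)
    return RELOC_GOTOFF;

  /* Unknown placement: be pessimistic and go through the GOT.  */
  if (!s->segment_known)
    return RELOC_GOT;

  /* Known placement: read-only data is PC-relative, writable data is
     addressed relative to r9.  */
  return s->is_readonly ? RELOC_PCREL : RELOC_GOTOFF;
}
```

Calling choose_static_addr_reloc (true, &info) with segment_known set to false models the pessimistic GOT case, which is the path taken for data placed in a named section.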