From: Richard Sandiford
To: Wilco Dijkstra via Gcc-patches
Cc: Wilco Dijkstra
Subject: Re: [PATCH] AArch64: Improve GOT addressing
Date: Mon, 10 May 2021 13:05:52 +0100
In-Reply-To: (Wilco Dijkstra via Gcc-patches's message of "Wed, 5 May 2021 13:17:13 +0000")

Wilco Dijkstra via Gcc-patches writes:
> Improve GOT addressing by emitting the instructions as a pair. This reduces
> register pressure and improves code quality. With -fPIC codesize improves by
> 0.65% and SPECINT2017 improves by 0.25%.
>
> Passes bootstrap and regress. OK for commit?

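For anyone following along, here is a minimal sketch of the kind of access being discussed. The names `counter` and `counter_address` are hypothetical and are not taken from the patch. When something like this is built as position-independent code for AArch64 with the default small code model (for example with -O2 -fPIC -S), the address of a preemptible global is normally loaded from the GOT with an ADRP followed by an LDR. The exact output depends on the compiler version and options, so treat the assembly in the comment as indicative only.

```c
/* Hypothetical global, for illustration only; not part of the patch.  */
int counter = 41;

/* Returns the address of the global.  Under -fPIC the symbol is
   preemptible, so its address is normally fetched from the GOT,
   roughly as:

	adrp	x0, :got:counter
	ldr	x0, [x0, :got_lo12:counter]

   noinline keeps the address computation in this function so that it
   is easy to find in the generated assembly.  */
__attribute__ ((noinline)) int *
counter_address (void)
{
  return &counter;
}
```

The quoted patch emits these two instructions from a single pattern rather than as separate insns.
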
Normally we should only put two instructions in the same define_insn
if there's a specific ABI or architectural reason for not separating
them. Doing it purely for optimisation reasons is going against the
general direction of travel. So I think the first question is: why
don't we simply delay the split until after reload instead, since
that's the more normal way of handling this kind of thing?

Also, the patch means that we use RTL of the form:

  (set (reg:PTR R)
       (unspec:PTR [(mem:PTR (symbol_ref:PTR S))]
                   UNSPEC_GOTSMALLPIC))

to represent the move of S into R. This should just be represented as:

  (set (reg:PTR R) (symbol_ref:PTR S))

and go through the normal move patterns.

Thanks,
Richard

> ChangeLog:
> 2021-05-05 Wilco Dijkstra
>
>         * config/aarch64/aarch64.md (ldr_got_small_): Emit ADRP+LDR GOT sequence.
>         (ldr_got_small_sidi): Likewise.
>         * config/aarch64/aarch64.c (aarch64_load_symref_appropriately): Remove tmp_reg.
>         (aarch64_print_operand): Correctly print got_lo12 in L specifier.
>
> ---
>
> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index 641c83b479e76cbcc75b299eb7ae5f634d9db7cd..32c5c76d3c001a79d2a69b7f8243f1f1f605f901 100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -3625,27 +3625,21 @@ aarch64_load_symref_appropriately (rtx dest, rtx imm,
>
>          rtx insn;
>          rtx mem;
> -        rtx tmp_reg = dest;
>          machine_mode mode = GET_MODE (dest);
>
> -        if (can_create_pseudo_p ())
> -          tmp_reg = gen_reg_rtx (mode);
> -
> -        emit_move_insn (tmp_reg, gen_rtx_HIGH (mode, imm));
>          if (mode == ptr_mode)
>            {
>              if (mode == DImode)
> -              insn = gen_ldr_got_small_di (dest, tmp_reg, imm);
> +              insn = gen_ldr_got_small_di (dest, imm);
>              else
> -              insn = gen_ldr_got_small_si (dest, tmp_reg, imm);
> +              insn = gen_ldr_got_small_si (dest, imm);
>
>              mem = XVECEXP (SET_SRC (insn), 0, 0);
>            }
>          else
>            {
>              gcc_assert (mode == Pmode);
> -
> -            insn = gen_ldr_got_small_sidi (dest, tmp_reg, imm);
> +            insn = gen_ldr_got_small_sidi (dest, imm);
>              mem = XVECEXP (XEXP (SET_SRC (insn), 0), 0, 0);
>            }
>
> @@ -11019,7 +11013,7 @@ aarch64_print_operand (FILE *f, rtx x, int code)
>        switch (aarch64_classify_symbolic_expression (x))
>          {
>          case SYMBOL_SMALL_GOT_4G:
> -          asm_fprintf (asm_out_file, ":lo12:");
> +          asm_fprintf (asm_out_file, ":got_lo12:");
>            break;
>
>          case SYMBOL_SMALL_TLSGD:
> diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
> index abfd84526745d029ad4953eabad6dd17b159a218..36c5c054f86e9cdd1f0945cdbc1beb47aa7ad80a 100644
> --- a/gcc/config/aarch64/aarch64.md
> +++ b/gcc/config/aarch64/aarch64.md
> @@ -6705,25 +6705,23 @@ (define_insn "add_losym_"
>
>  (define_insn "ldr_got_small_"
>    [(set (match_operand:PTR 0 "register_operand" "=r")
> -        (unspec:PTR [(mem:PTR (lo_sum:PTR
> -                              (match_operand:PTR 1 "register_operand" "r")
> -                              (match_operand:PTR 2 "aarch64_valid_symref" "S")))]
> +        (unspec:PTR [(mem:PTR (match_operand:PTR 1 "aarch64_valid_symref" "S"))]
>                      UNSPEC_GOTSMALLPIC))]
>    ""
> -  "ldr\\t%0, [%1, #:got_lo12:%c2]"
> -  [(set_attr "type" "load_")]
> +  "adrp\\t%0, %A1\;ldr\\t%0, [%0, %L1]"
> +  [(set_attr "type" "load_")
> +   (set_attr "length" "8")]
>  )
>
>  (define_insn "ldr_got_small_sidi"
>    [(set (match_operand:DI 0 "register_operand" "=r")
>          (zero_extend:DI
> -         (unspec:SI [(mem:SI (lo_sum:DI
> -                             (match_operand:DI 1 "register_operand" "r")
> -                             (match_operand:DI 2 "aarch64_valid_symref" "S")))]
> +         (unspec:SI [(mem:SI (match_operand:DI 1 "aarch64_valid_symref" "S"))]
>                      UNSPEC_GOTSMALLPIC)))]
>    "TARGET_ILP32"
> -  "ldr\\t%w0, [%1, #:got_lo12:%c2]"
> -  [(set_attr "type" "load_4")]
> +  "adrp\\t%0, %A1\;ldr\\t%w0, [%0, %L1]"
> +  [(set_attr "type" "load_4")
> +   (set_attr "length" "8")]
>  )
>
>  (define_insn "ldr_got_small_28k_"