Subject: Re: [pushed][PATCH v2] LoongArch: Modify the address calculation logic for obtaining array element values through fp.
To: gcc-patches@gcc.gnu.org
Cc: xry111@xry111.site, i@xen0n.name, xuchenghua@loongson.cn
References: <20240130075459.1347-1-chenglulu@loongson.cn>
From: chenglulu
Message-ID: <28214b50-66f6-7b19-5410-d91e65694750@loongson.cn>
Date: Fri, 2 Feb 2024 09:15:44 +0800
In-Reply-To: <20240130075459.1347-1-chenglulu@loongson.cn>

Pushed to r14-8716.

On 2024/1/30 at 3:55 PM, Lulu Cheng wrote:
> Modify address calculation logic from (((a x C) + fp) + offset) to ((fp + offset) + a x C).
> This changes the register dependencies and improves the generated code.
> The value of C is 2, 4 or 8.
>
> The following is the assembly code before and after the modification for a loop in SPEC CPU2006 401.bzip2:
>
> old                           | new
> .L71:                         | .L71:
>     slli.d  $r12,$r15,2       |     slli.d  $r12,$r15,2
>     ldx.w   $r13,$r22,$r12    |     ldx.w   $r13,$r22,$r12
>     addi.d  $r15,$r15,-1      |     addi.d  $r15,$r15,-1
>     slli.w  $r16,$r15,0       |     slli.w  $r16,$r15,0
>     addi.w  $r13,$r13,-1      |     addi.w  $r13,$r13,-1
>     slti    $r14,$r13,0       |     slti    $r14,$r13,0
>     add.w   $r12,$r26,$r13    |     add.w   $r12,$r26,$r13
>     maskeqz $r12,$r12,$r14    |     maskeqz $r12,$r12,$r14
>     masknez $r14,$r13,$r14    |     masknez $r14,$r13,$r14
>     or      $r12,$r12,$r14    |     or      $r12,$r12,$r14
>     ldx.bu  $r14,$r30,$r12    |     ldx.bu  $r14,$r30,$r12
>     lu12i.w $r13,4096>>12     |     alsl.d  $r14,$r14,$r18,2
>     ori     $r13,$r13,432     |     ldptr.w $r13,$r14,0
>     add.d   $r13,$r13,$r3     |     addi.w  $r17,$r13,-1
>     alsl.d  $r14,$r14,$r13,2  |     stptr.w $r17,$r14,0
>     ldptr.w $r13,$r14,-1968   |     slli.d  $r13,$r13,2
>     addi.w  $r17,$r13,-1      |     stx.w   $r12,$r22,$r13
>     st.w    $r17,$r14,-1968   |     ldptr.w $r12,$r19,0
>     slli.d  $r13,$r13,2       |     blt     $r12,$r16,.L71
>     stx.w   $r12,$r22,$r13    |     .align  4
>     ldptr.w $r12,$r18,-2048   |
>     blt     $r12,$r16,.L71    |
>     .align  4                 |
>
> This patch is ported from RISC-V's commit r14-3111.
>
> gcc/ChangeLog:
>
> 	* config/loongarch/loongarch.cc (mem_shadd_or_shadd_rtx_p): New function.
> 	(loongarch_legitimize_address): Add logical transformation code.
>
> ---
> v1 -> v2:
>   Modify code format and comment information.
>
> ---
>  gcc/config/loongarch/loongarch.cc | 43 +++++++++++++++++++++++++++++++
>  1 file changed, 43 insertions(+)
>
> diff --git a/gcc/config/loongarch/loongarch.cc b/gcc/config/loongarch/loongarch.cc
> index b494040d165..b8f6f6689bb 100644
> --- a/gcc/config/loongarch/loongarch.cc
> +++ b/gcc/config/loongarch/loongarch.cc
> @@ -3219,6 +3219,22 @@ loongarch_split_symbol (rtx temp, rtx addr, machine_mode mode, rtx *low_out)
>    return true;
>  }
>
> +/* Helper loongarch_legitimize_address.  Given X, return true if it
> +   is a left shift by 1, 2 or 3 positions or a multiply by 2, 4 or 8.
> +
> +   This respectively represent canonical shift-add rtxs or scaled
> +   memory addresses.  */
> +static bool
> +mem_shadd_or_shadd_rtx_p (rtx x)
> +{
> +  return ((GET_CODE (x) == ASHIFT
> +           || GET_CODE (x) == MULT)
> +          && CONST_INT_P (XEXP (x, 1))
> +          && ((GET_CODE (x) == ASHIFT && IN_RANGE (INTVAL (XEXP (x, 1)), 1, 3))
> +              || (GET_CODE (x) == MULT
> +                  && IN_RANGE (exact_log2 (INTVAL (XEXP (x, 1))), 1, 3))));
> +}
> +
>  /* This function is used to implement LEGITIMIZE_ADDRESS.  If X can
>     be legitimized in a way that the generic machinery might not expect,
>     return a new address, otherwise return NULL.  MODE is the mode of
> @@ -3242,6 +3258,33 @@ loongarch_legitimize_address (rtx x, rtx oldx ATTRIBUTE_UNUSED,
>        loongarch_split_plus (x, &base, &offset);
>        if (offset != 0)
>          {
> +          /* Handle (plus (plus (mult (a) (mem_shadd_constant)) (fp)) (C)) case.  */
> +          if (GET_CODE (base) == PLUS && mem_shadd_or_shadd_rtx_p (XEXP (base, 0))
> +              && IMM12_OPERAND (offset))
> +            {
> +              rtx index = XEXP (base, 0);
> +              rtx fp = XEXP (base, 1);
> +
> +              if (REG_P (fp) && REGNO (fp) == VIRTUAL_STACK_VARS_REGNUM)
> +                {
> +                  /* If we were given a MULT, we must fix the constant
> +                     as we're going to create the ASHIFT form.  */
> +                  int shift_val = INTVAL (XEXP (index, 1));
> +                  if (GET_CODE (index) == MULT)
> +                    shift_val = exact_log2 (shift_val);
> +
> +                  rtx reg1 = gen_reg_rtx (Pmode);
> +                  rtx reg3 = gen_reg_rtx (Pmode);
> +                  loongarch_emit_binary (PLUS, reg1, fp, GEN_INT (offset));
> +                  loongarch_emit_binary (PLUS, reg3,
> +                                         gen_rtx_ASHIFT (Pmode, XEXP (index, 0),
> +                                                         GEN_INT (shift_val)),
> +                                         reg1);
> +
> +                  return reg3;
> +                }
> +            }
> +
>            if (!loongarch_valid_base_register_p (base, mode, false))
>              base = copy_to_mode_reg (Pmode, base);
>            addr = loongarch_add_offset (NULL, base, offset);
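
For readers who want to see the reassociation without building GCC, here is a
rough standalone sketch. It is not the code from the patch and does not use
GCC's RTL API: ToyExpr, is_shadd_scale, shift_amount, exact_log2_toy and the
two address_* helpers are hypothetical names made up for illustration, and
addresses are modelled as plain 64-bit unsigned integers. It mirrors the shape
check (shift by 1..3 or multiply by 2, 4 or 8) and shows that both orderings
yield the same address, while the new ordering isolates (fp + offset), which
does not depend on the index.

```cpp
#include <cassert>
#include <cstdint>

// Toy stand-in for an RTL node: only the shapes the predicate cares about.
// (Hypothetical type, not part of GCC.)
enum class Code { Ashift, Mult, Other };

struct ToyExpr {
  Code code;
  std::int64_t constant;  // second operand, assumed to be a constant integer
};

// Returns log2(v) when v is a positive power of two, otherwise -1.
int exact_log2_toy(std::int64_t v) {
  if (v <= 0 || (v & (v - 1)) != 0) return -1;
  int n = 0;
  while ((v >>= 1) != 0) ++n;
  return n;
}

// True for a left shift by 1, 2 or 3, or a multiply by 2, 4 or 8.
bool is_shadd_scale(const ToyExpr &e) {
  if (e.code == Code::Ashift) return e.constant >= 1 && e.constant <= 3;
  if (e.code == Code::Mult) {
    int s = exact_log2_toy(e.constant);
    return s >= 1 && s <= 3;
  }
  return false;
}

// Normalises either accepted form to a shift amount, as the patch does
// before building the ASHIFT form.
int shift_amount(const ToyExpr &e) {
  return e.code == Code::Mult ? exact_log2_toy(e.constant)
                              : static_cast<int>(e.constant);
}

// Old ordering: ((a << s) + fp) + offset.
std::uint64_t address_old_order(std::uint64_t fp, std::uint64_t a, int s,
                                std::int64_t offset) {
  return ((a << s) + fp) + static_cast<std::uint64_t>(offset);
}

// New ordering: (fp + offset) + (a << s). The first sum is independent of a.
std::uint64_t address_new_order(std::uint64_t fp, std::uint64_t a, int s,
                                std::int64_t offset) {
  std::uint64_t reg1 = fp + static_cast<std::uint64_t>(offset);
  return reg1 + (a << s);
}

// Small usage example: a multiply by 4 becomes a shift by 2, and both
// orderings agree on the resulting address.
void demo() {
  ToyExpr index{Code::Mult, 4};
  assert(is_shadd_scale(index));
  int s = shift_amount(index);
  assert(address_old_order(0x1000, 5, s, -1968) ==
         address_new_order(0x1000, 5, s, -1968));
}
```

Unsigned arithmetic is used so that a negative offset wraps in a well-defined
way; the real transformation works on Pmode RTL expressions and is additionally
guarded by IMM12_OPERAND (offset) and the VIRTUAL_STACK_VARS_REGNUM check, which
this sketch does not model.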