Message-ID: <2d547e6296b28f9e8efcc3352fb363ca8967bff6.camel@xry111.site>
Subject: Re: [PATCH v3] LoongArch: Optimize immediate load.
From: Xi Ruoyao
To: Lulu Cheng, gcc-patches@gcc.gnu.org
Cc: i@xen0n.name, xuchenghua@loongson.cn
Date: Fri, 04 Nov 2022 10:56:59 +0800
In-Reply-To: <2250cca5-a1c9-aecc-9b01-1b6e52ec5e06@loongson.cn>
References: <20221101120444.412376-1-chenglulu@loongson.cn>
 <5b19fb73fc15ae68951118a96393ed2222b41190.camel@xry111.site>
 <2250cca5-a1c9-aecc-9b01-1b6e52ec5e06@loongson.cn>

On Fri, 2022-11-04 at 10:33 +0800, Lulu Cheng wrote:
>
> On 2022/11/4 at 10:22 AM, Xi Ruoyao wrote:
> > On Tue, 2022-11-01 at 20:04 +0800, Lulu Cheng wrote:
> > > gcc/ChangeLog:
> > >
> > >         * config/loongarch/constraints.md (x): New constraint.
> > >         * config/loongarch/loongarch.cc (struct loongarch_address_info):
> > >         Adds a method to load the immediate 32 to 64 bit field.
> > >         (struct loongarch_integer_op): Define a new member curr_value,
> > >         that records the value of the number stored in the destination
> > >         register immediately after the current instruction has run.
> > >         (LARCH_MAX_INTEGER_OPS): Define this macro as 3.
> > >         (LU32I_B): Move to the loongarch.h.
> > >         (LU52I_B): Likewise.
> > >         (loongarch_build_integer): Adds a method to load the immediate
> > >         32 to 63 bits.
> > >         (loongarch_move_integer): Likewise.
> > We need to mention "call set_unique_reg_note" here because it seems the
> > key to resolve the issue.
>
> During debugging, I found the problem because the source register and
> destination register of the lu32i.d instruction are the same. As a
> result, during loop2_invariant pass, the destination register of
> lu32i.d is used twice, so the instructions after this instruction will
> not be brought out of the loop. Therefore, I combined lu32i.d and
> lu52i.d into one template, which avoids the situation that the same
> register is used twice. It is not split into two instructions until
> loop2_invariant has been optimized. So I don't think
> "set_unique_reg_note" plays a decisive role in this optimization.

It would be better to mention this logic in the commit message then, so
that others do not misunderstand it the way I did.

Again, the code change LGTM, and I've tested it with
--with-build-config=bootstrap-ubsan.

> >
> > Otherwise LGTM.
> >
> > >         (loongarch_print_operand_reloc): Modifying comment information.
> > >         * config/loongarch/loongarch.h (LU32I_B): Move from loongarch.cc.
> > >         (LU52I_B): Likewise.
> > >         (HWIT_UC_0xFFFFFFFF): New macro.
> > >         (HI32_OPERAND): New macro.
> > >         * config/loongarch/loongarch.md (load_hi32): New template.
> > >         * config/loongarch/predicates.md (const_hi32_operand): Determines
> > >         whether the value is an immediate number that has a value of only
> > >         the higher 32 bits.
> > >         (hi32_mask_operand): Immediately counts the mask of 32 to 61 bits.
>

-- 
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University