From: Xi Ruoyao
To: Jeff Law, Jin Ma, gcc-patches@gcc.gnu.org
Cc: palmer@dabbelt.com, richard.sandiford@arm.com, kito.cheng@gmail.com, philipp.tomsich@vrull.eu, christoph.muellner@vrull.eu, rdapp.gcc@gmail.com, juzhe.zhong@rivai.ai, vineetg@rivosinc.com, jinma.contrib@gmail.com
Subject: Re: [PATCH v2] In the pipeline, USE or CLOBBER should delay execution if it starts a new live range.
Date: Mon, 13 Nov 2023 01:41:52 +0800
Message-ID: <37172a21172cbe6ddb580d2619ca3bf3e93b580c.camel@xry111.site>
References: <20230814112255.2071-1-jinma@linux.alibaba.com>

On Sat, 2023-11-11 at 13:12 -0700, Jeff Law wrote:
>
>
> On 8/14/23 05:22, Jin Ma wrote:
> > CLOBBER and USE does not represent real instructions, but in the
> > process of pipeline optimization, they will wait for transmission
> > in ready list like other insns, without considering resource
> > conflicts and cycles. This results in a multi-issue CPU architecture
> > that can be issued at any time if other regular insns have resource
> > conflicts or cannot be launched for other reasons. As a result,
> > its position is advanced in the generated insns sequence, which
> > will affect register allocation and often lead to more redundant
> > mov instructions.
> >
> > A simple example:
> > https://github.com/majin2020/gcc-test/blob/master/test.c
> > This is a function in the dhrystone benchmark.
> >
> > https://github.com/majin2020/gcc-test/blob/0b08c1a13de9663d7d9aba7539b960ec0607ca24/test.c.299r.sched1
> > This is a log of the pass 'sched1' When -mtune=rocket but issue_rate
> > == 2.
> >
> > The pipeline is:
> > ;; | insn | prio |
> > ;; |  17  |  3   | r142=a0         alu
> > ;; |  14  |  0   | clobber r136    nothing
> > ;; |  13  |  0   | clobber a0      nothing
> > ;; |  18  |  2   | r143=a1         alu
> > ...
> > ;; |  12  |  0   | a0=r136         alu
> > ;; |  15  |  0   | use a0          nothing
> >
> > In this log, insn 13 and 14 are much ahead of schedule, which risks
> > generating
> > redundant mov instructions, which seems unreasonable.
> >
> > Therefore, I submit patch again on the basis of the last review
> > opinions to try to solve this problem.
> >
> > https://github.com/majin2020/gcc-test/commit/efcb43e3369e771bde702955048bfe3f501263dd#diff-805031b1be5092a2322852a248d0b0f92eef7cad5784a8209f4dfc6221407457L189
> > This is the diff log of shed1 after patch is added.
> >
> > The new pipeline is:
> > ;; | insn | prio |
> > ;; |  17  |  3   | r142=a0         alu
> > ...
> > ;; |  10  |  0   | [r144]=r141     alu
> > ;; |  13  |  0   | clobber a0      nothing
> > ;; |  14  |  0   | clobber r136    nothing
> > ;; |  12  |  0   | a0=r136         alu
> > ;; |  15  |  0   | use a0          nothing
> >
> > gcc/ChangeLog:
> >         * haifa-sched.cc (use_or_clobber_starts_range_p): New.
> >         (prune_ready_list): USE or CLOBBER should delay execution
> >         if it starts a new live range.
> OK for the trunk.  It doesn't look like you have write access and I
> don't see anything about what testing was done.  Standard practice is
> to do a bootstrap and regression test on a primary platform such as
> x86, aarch64, ppc64.
>
> I went ahead and did a bootstrap and regression test on x86_64, then
> pushed this to the trunk.

Unfortunately this patch has triggered a bootstrap comparison failure on
loongarch64-linux-gnu: https://gcc.gnu.org/PR112497.

-- 
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University