From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-bugzilla@gcc.gnu.org>
Received: by sourceware.org (Postfix, from userid 48)
	id CD19B385E836; Thu,  9 May 2024 02:44:40 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org CD19B385E836
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org;
	s=default; t=1715222680;
	bh=JufJhkhhnOpHk8mM+O9ivmcNcn0P7Wdw8POallOpOPQ=;
	h=From:To:Subject:Date:In-Reply-To:References:From;
	b=MFOqKLvTYril+JtU5B4WDr7pnn0WRSDm1RNJLem19FulW5S81lGjNp7KLvX/O6tOR
	 E74jKtcp+TLfSozQhNS/3ZrslKieyq+dV5X1zlGXfD4MkA72ta8tRhtuYKhcxFj75p
	 LQ0iwATc2fSFRQcFcepdLcxdBRcb0GOYAWSgaE+k=
From: "chenglulu at loongson dot cn" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/114978] [14/15 regression] 548.exchange2_r 14%-28%
 regressions on Loongarch64 after gcc 14 snapshot 20240317
Date: Thu, 09 May 2024 02:44:40 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: target
X-Bugzilla-Version: 14.0
X-Bugzilla-Keywords: needs-bisection
X-Bugzilla-Severity: normal
X-Bugzilla-Who: chenglulu at loongson dot cn
X-Bugzilla-Status: UNCONFIRMED
X-Bugzilla-Resolution: 
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: 14.2
X-Bugzilla-Flags: 
X-Bugzilla-Changed-Fields: 
Message-ID: <bug-114978-4-GKOFWNHebi@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-114978-4@http.gcc.gnu.org/bugzilla/>
References: <bug-114978-4@http.gcc.gnu.org/bugzilla/>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
List-Id: <gcc-bugs.sourceware.org>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114978

--- Comment #8 from chenglulu <chenglulu at loongson dot cn> ---
(In reply to Chen Chen from comment #0)
> We tested Loongarch64 CPU Loongson 3A6000 with "LA664" architecture in Linux
> operating system AOSC OS 11.4.0 (default gcc version is 13.2.0). And we
> found the 548.exchange2_r benchmark from SPEC 2017 INTrate suite suffered
> significant regressions from 14% to 28% with various compiling options.
>
> The rate-1 results are following:
>
/* snip */
>
> after snapshot 20240317 score 18-23.1% lower with parameters "-g -Ofast
> -march=la664":
> 13.2.0:    "-march=la664" flag is not supported
> 20240317:  11.5 (227s)
> 20240324:  8.84 (296s)
> 20240430:  9.43 (278s)
> 14.1.0:    9.42 (278s)
>
/* snip */
>
>
> after snapshot 20240317 score 26.3-26.6% lower with parameters "-g -Ofast
> -march=la464":
> 13.2.0:    8.76 (299s)
> 20240317:  12.8 (205s)
> 20240324:  9.39 (279s)
> 20240430:  9.43 (278s)
> 14.1.0:    9.43 (278s)
>
>

> 20240317:  11.5 (227s) -march=la664
> 20240317:  12.8 (205s) -march=la464
I looked into the reason for the gap between the above two results. The
performance regression is caused by r14-6814. With the following modification,
the scores for -march=la664 and -march=la464 become the same.
diff --git a/gcc/config/loongarch/loongarch-def.cc b/gcc/config/loongarch/loongarch-def.cc
index e8c129ce643..f27284cb20a 100644
--- a/gcc/config/loongarch/loongarch-def.cc
+++ b/gcc/config/loongarch/loongarch-def.cc
@@ -111,11 +111,7 @@ loongarch_rtx_cost_data::loongarch_rtx_cost_data ()
  tune targets (i.e. -mtune=native while PRID does not correspond to
  any known "-mtune" type).  */
 array_tune<loongarch_rtx_cost_data> loongarch_cpu_rtx_cost_data =
-  array_tune<loongarch_rtx_cost_data> ()
-    .set (CPU_LA664,
-         loongarch_rtx_cost_data ()
-           .movcf2gr_ (COSTS_N_INSNS (1))
-           .movgr2cf_ (COSTS_N_INSNS (1)));
+  array_tune<loongarch_rtx_cost_data> ();
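
For readers unfamiliar with the chained-setter style in the hunk above, here is
a minimal, self-contained C++ sketch of the pattern. It is not the GCC source:
cost_data, cost_table, cpu_id, both helper functions and the default cost
numbers are hypothetical stand-ins. Only the COSTS_N_INSNS scaling (N * 4) and
the .set (CPU_LA664, ...) shape follow GCC and the diff.

```cpp
#include <array>
#include <cassert>

// Same scaling as GCC's COSTS_N_INSNS macro: N instructions cost N * 4.
constexpr int costs_n_insns (int n) { return n * 4; }

// Hypothetical CPU index; the real enumeration has more entries.
enum cpu_id { CPU_LA464, CPU_LA664, N_CPU_TYPES };

// Hypothetical stand-in for loongarch_rtx_cost_data.
struct cost_data
{
  // Placeholder defaults for illustration only, not taken from GCC.
  int movcf2gr = costs_n_insns (7);
  int movgr2cf = costs_n_insns (15);

  // Chained setters: each returns a modified copy.
  cost_data movcf2gr_ (int cost) const
  {
    cost_data copy = *this;
    copy.movcf2gr = cost;
    return copy;
  }

  cost_data movgr2cf_ (int cost) const
  {
    cost_data copy = *this;
    copy.movgr2cf = cost;
    return copy;
  }
};

// Hypothetical stand-in for array_tune<loongarch_rtx_cost_data>.
struct cost_table
{
  std::array<cost_data, N_CPU_TYPES> data{};

  // Replace the entry for one CPU and return the updated table.
  cost_table set (cpu_id cpu, const cost_data &value) const
  {
    cost_table copy = *this;
    copy.data[cpu] = value;
    return copy;
  }
};

// Shape of the lines removed by the diff: LA664 gets cheaper moves.
inline cost_table with_la664_override ()
{
  return cost_table ()
    .set (CPU_LA664,
          cost_data ()
            .movcf2gr_ (costs_n_insns (1))
            .movgr2cf_ (costs_n_insns (1)));
}

// Shape of the line added by the diff: every CPU keeps the defaults.
inline cost_table without_override ()
{
  return cost_table ();
}
```

In this sketch, with_la664_override () gives the LA664 entry a one-instruction
cost for both moves while LA464 keeps the defaults. without_override () leaves
both entries identical, which corresponds to the modification above making the
two -march settings use the same costs for these moves.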