public inbox for gcc-bugs@sourceware.org
* [Bug fortran/114978] New: 548.exchange2_r 14%-28% regressions on Loongarch64 after gcc 14 snapshot 20240317
@ 2024-05-07 17:58 chz0808 at gmail dot com
  2024-05-07 18:12 ` [Bug fortran/114978] [14/14 regression] " sjames at gcc dot gnu.org
                   ` (21 more replies)
  0 siblings, 22 replies; 23+ messages in thread
From: chz0808 at gmail dot com @ 2024-05-07 17:58 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114978

            Bug ID: 114978
           Summary: 548.exchange2_r 14%-28% regressions on Loongarch64
                    after gcc 14 snapshot 20240317
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: fortran
          Assignee: unassigned at gcc dot gnu.org
          Reporter: chz0808 at gmail dot com
  Target Milestone: ---

We tested a Loongson 3A6000 CPU (Loongarch64, "LA664" architecture) running
AOSC OS 11.4.0, whose default gcc version is 13.2.0. We found that the
548.exchange2_r benchmark from the SPEC 2017 INTrate suite regressed by 14% to
28%, depending on the compiler options.

The rate-1 results are as follows:

after snapshot 20240317 score 14.3-19.3% lower with parameters "-g -Ofast
-march=native":
13.2.0:    11.7 (223s) [gcc 13.2.0, system default]
20240317:  11.0 (237s) [gcc 14 snapshot 20240317]
20240324:  8.88 (295s) [gcc 14 snapshot 20240324]
20240430:  9.03 (290s) [gcc 14 snapshot 20240430, 14.1.0-RC]
14.1.0:    9.43 (278s) [gcc 14.1.0 release]

after snapshot 20240317 score 16.5-20.8% lower with parameters "-g -Ofast
-march=native -flto": 
13.2.0:    12.0 (218s)
20240317:  10.6 (248s)
20240324:  8.40 (312s)
20240430:  8.48 (309s)
14.1.0:    8.85 (296s)


after snapshot 20240317 score 18-23.1% lower with parameters "-g -Ofast
-march=la664":       
13.2.0:    "-march=la664" flag is not supported
20240317:  11.5 (227s)
20240324:  8.84 (296s)
20240430:  9.43 (278s)
14.1.0:    9.42 (278s)


after snapshot 20240317 score 20.3-21.2% lower with parameters "-g -Ofast
-march=la664 -flto": 
13.2.0:    "-march=la664" flag is not supported
20240317:  11.1 (236s)
20240324:  8.75 (299s)
20240430:  8.85 (296s)
14.1.0:    8.85 (296s)


after snapshot 20240317 score 26.3-26.6% lower with parameters "-g -Ofast
-march=la464":       
13.2.0:    8.76 (299s)
20240317:  12.8 (205s)
20240324:  9.39 (279s)
20240430:  9.43 (278s)
14.1.0:    9.43 (278s)


after snapshot 20240317 score 26.6-28% lower with parameters "-g -Ofast
-march=la464 -flto": 
13.2.0:    8.52 (307s)
20240317:  12.8 (204s)
20240324:  9.22 (284s)
20240430:  9.37 (280s)
14.1.0:    9.40 (279s)
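
The score/time pairs in the tables above appear to follow a fixed relation: the product of score and run time stays close to 2620 in every row. The following Python sketch checks that against the "-g -Ofast -march=native" rows. The 2620 s reference time is an assumption inferred from the reported numbers, not a value quoted from SPEC documentation, and the function names are illustrative only.

```python
# Reported (score, seconds) pairs for "-g -Ofast -march=native" from the first table.
REPORTED = {
    "13.2.0": (11.7, 223),
    "20240317": (11.0, 237),
    "20240324": (8.88, 295),
    "20240430": (9.03, 290),
    "14.1.0": (9.43, 278),
}

# Assumption: inferred from score * seconds in the rows above,
# not taken from SPEC documentation.
ASSUMED_REFERENCE_SECONDS = 2620.0


def estimated_score(seconds, reference=ASSUMED_REFERENCE_SECONDS):
    """Estimate a rate-1 score as reference time divided by run time."""
    return reference / seconds


def relative_error(reported, estimated):
    """Relative difference between a reported and an estimated score."""
    return abs(reported - estimated) / reported


for label, (score, seconds) in REPORTED.items():
    est = estimated_score(seconds)
    print(f"{label}: reported {score}, estimated {est:.2f}")
```

If the relation holds, a change in score maps directly to a change in run time, so the percentages in this report can be read either way. Small differences are expected because the times are rounded to whole seconds.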


The gcc 14 snapshots and gcc 14.1.0 were built with the following configure
parameters:

--enable-shared --enable-threads=posix --with-system-zlib
--enable-gnu-indirect-function --enable-__cxa_atexit
--disable-libunwind-exceptions --enable-clocale=gnu --disable-libstdcxx-pch
--disable-libssp --enable-gnu-unique-object --enable-linker-build-id
--enable-lto --enable-plugin --enable-install-libiberty --disable-multilib
--disable-werror --enable-pie --enable-checking=release
--enable-libstdcxx-dual-abi --with-default-libstdcxx-abi=new
--enable-default-pie --enable-default-ssp --enable-bootstrap
--enable-languages=c,c++,fortran,lto --with-abi=lp64d --with-arch=loongarch64
--with-tune=la664 --build=loongarch64-aosc-linux-gnu


The regression may affect other types of CPUs as well. A quick test on an AMD
Zen4 CPU (R9 7940HS) showed a similar but smaller regression.

The rate-1 results on x86_64 (AMD R9 7940HS) running Debian 12:

after snapshot 20240317 score 8.6-9.6% lower with parameters "-m64 -g -Ofast
-march=znver3":
12.2.0:    30.1 (87.0s) [gcc 12.2.0, system default]
13.2.0:    30.6 (85.7s) [gcc 13.2 release]
20240317:  31.4 (83.3s) [gcc 14 snapshot]
20240324:  28.7 (91.2s) [gcc 14 snapshot]
20240430:  28.4 (92.2s) [gcc 14 snapshot, 14.1.0-RC]

after snapshot 20240317 score 10% lower with parameters "-m64 -g -Ofast
-march=znver3 -flto":
12.2.0:    29.0 (90.3s) 
13.2.0:    30.9 (84.9s) 
20240317:  32.0 (81.8s) 
20240324:  28.8 (90.9s) 
20240430:  28.8 (91.1s)

gcc 13 and gcc 14 were built with the following configure parameters:

--enable-shared --enable-threads=posix --with-system-zlib
--enable-gnu-indirect-function --enable-__cxa_atexit
--disable-libunwind-exceptions --enable-clocale=gnu --disable-libstdcxx-pch
--disable-libssp --enable-gnu-unique-object --enable-linker-build-id
--enable-lto --enable-plugin --enable-install-libiberty --disable-multilib
--disable-werror --enable-pie --enable-checking=release
--enable-libstdcxx-dual-abi --with-default-libstdcxx-abi=new
--enable-default-pie --enable-default-ssp --enable-bootstrap
--enable-languages=c,c++,fortran,lto  --build=x86_64-linux-gnu
--host=x86_64-linux-gnu --target=x86_64-linux-gnu


* [Bug fortran/114978] [14/14 regression] 548.exchange2_r 14%-28% regressions on Loongarch64 after gcc 14 snapshot 20240317
  2024-05-07 17:58 [Bug fortran/114978] New: 548.exchange2_r 14%-28% regressions on Loongarch64 after gcc 14 snapshot 20240317 chz0808 at gmail dot com
@ 2024-05-07 18:12 ` sjames at gcc dot gnu.org
  2024-05-07 18:13 ` xry111 at gcc dot gnu.org
                   ` (20 subsequent siblings)
  21 siblings, 0 replies; 23+ messages in thread
From: sjames at gcc dot gnu.org @ 2024-05-07 18:12 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114978

Sam James <sjames at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |xry111 at gcc dot gnu.org
            Summary|548.exchange2_r 14%-28%     |[14/14 regression]
                   |regressions on Loongarch64  |548.exchange2_r 14%-28%
                   |after gcc 14 snapshot       |regressions on Loongarch64
                   |20240317                    |after gcc 14 snapshot
                   |                            |20240317

--- Comment #1 from Sam James <sjames at gcc dot gnu.org> ---
Neither of the two loong maintainers seems to have an account on BZ (?), so
CCing xry111. Apologies if I missed their accounts.


* [Bug fortran/114978] [14/14 regression] 548.exchange2_r 14%-28% regressions on Loongarch64 after gcc 14 snapshot 20240317
  2024-05-07 17:58 [Bug fortran/114978] New: 548.exchange2_r 14%-28% regressions on Loongarch64 after gcc 14 snapshot 20240317 chz0808 at gmail dot com
  2024-05-07 18:12 ` [Bug fortran/114978] [14/14 regression] " sjames at gcc dot gnu.org
@ 2024-05-07 18:13 ` xry111 at gcc dot gnu.org
  2024-05-07 18:17 ` [Bug target/114978] " xry111 at gcc dot gnu.org
                   ` (19 subsequent siblings)
  21 siblings, 0 replies; 23+ messages in thread
From: xry111 at gcc dot gnu.org @ 2024-05-07 18:13 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114978

Xi Ruoyao <xry111 at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |needs-bisection

--- Comment #2 from Xi Ruoyao <xry111 at gcc dot gnu.org> ---
I don't have access to SPEC, so I cannot confirm or refute the issue.


* [Bug target/114978] [14/14 regression] 548.exchange2_r 14%-28% regressions on Loongarch64 after gcc 14 snapshot 20240317
  2024-05-07 17:58 [Bug fortran/114978] New: 548.exchange2_r 14%-28% regressions on Loongarch64 after gcc 14 snapshot 20240317 chz0808 at gmail dot com
  2024-05-07 18:12 ` [Bug fortran/114978] [14/14 regression] " sjames at gcc dot gnu.org
  2024-05-07 18:13 ` xry111 at gcc dot gnu.org
@ 2024-05-07 18:17 ` xry111 at gcc dot gnu.org
  2024-05-07 18:18 ` xry111 at gcc dot gnu.org
                   ` (18 subsequent siblings)
  21 siblings, 0 replies; 23+ messages in thread
From: xry111 at gcc dot gnu.org @ 2024-05-07 18:17 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114978

Xi Ruoyao <xry111 at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Target|                            |loongarch64-*-*
          Component|fortran                     |target

--- Comment #3 from Xi Ruoyao <xry111 at gcc dot gnu.org> ---
Changed component to target for now.

I'm suspicious about the 10% regression on x86_64.  IIRC there are already
multiple bug reports complaining about 5% SPEC regressions on x86_64, so I'd be
really surprised if there were a 10% regression on x86_64 that had not already
been reported.


* [Bug target/114978] [14/14 regression] 548.exchange2_r 14%-28% regressions on Loongarch64 after gcc 14 snapshot 20240317
  2024-05-07 17:58 [Bug fortran/114978] New: 548.exchange2_r 14%-28% regressions on Loongarch64 after gcc 14 snapshot 20240317 chz0808 at gmail dot com
                   ` (2 preceding siblings ...)
  2024-05-07 18:17 ` [Bug target/114978] " xry111 at gcc dot gnu.org
@ 2024-05-07 18:18 ` xry111 at gcc dot gnu.org
  2024-05-08  1:41 ` [Bug target/114978] [14/15 " chenglulu at loongson dot cn
                   ` (17 subsequent siblings)
  21 siblings, 0 replies; 23+ messages in thread
From: xry111 at gcc dot gnu.org @ 2024-05-07 18:18 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114978

--- Comment #4 from Xi Ruoyao <xry111 at gcc dot gnu.org> ---
s/suspicious/skeptical/


* [Bug target/114978] [14/15 regression] 548.exchange2_r 14%-28% regressions on Loongarch64 after gcc 14 snapshot 20240317
  2024-05-07 17:58 [Bug fortran/114978] New: 548.exchange2_r 14%-28% regressions on Loongarch64 after gcc 14 snapshot 20240317 chz0808 at gmail dot com
                   ` (3 preceding siblings ...)
  2024-05-07 18:18 ` xry111 at gcc dot gnu.org
@ 2024-05-08  1:41 ` chenglulu at loongson dot cn
  2024-05-08  8:20 ` rguenth at gcc dot gnu.org
                   ` (16 subsequent siblings)
  21 siblings, 0 replies; 23+ messages in thread
From: chenglulu at loongson dot cn @ 2024-05-08  1:41 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114978

--- Comment #5 from chenglulu <chenglulu at loongson dot cn> ---
I will verify it on multiple machines to see if the problem can be reproduced.


* [Bug target/114978] [14/15 regression] 548.exchange2_r 14%-28% regressions on Loongarch64 after gcc 14 snapshot 20240317
  2024-05-07 17:58 [Bug fortran/114978] New: 548.exchange2_r 14%-28% regressions on Loongarch64 after gcc 14 snapshot 20240317 chz0808 at gmail dot com
                   ` (4 preceding siblings ...)
  2024-05-08  1:41 ` [Bug target/114978] [14/15 " chenglulu at loongson dot cn
@ 2024-05-08  8:20 ` rguenth at gcc dot gnu.org
  2024-05-08 14:41 ` chz0808 at gmail dot com
                   ` (15 subsequent siblings)
  21 siblings, 0 replies; 23+ messages in thread
From: rguenth at gcc dot gnu.org @ 2024-05-08  8:20 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114978

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |14.2

--- Comment #6 from Richard Biener <rguenth at gcc dot gnu.org> ---
https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=471.407.0

shows a recent improvement that then regressed again; maybe you have a similar
artifact from your choice of snapshots.  Try a snapshot from February for
comparison, for example.


* [Bug target/114978] [14/15 regression] 548.exchange2_r 14%-28% regressions on Loongarch64 after gcc 14 snapshot 20240317
  2024-05-07 17:58 [Bug fortran/114978] New: 548.exchange2_r 14%-28% regressions on Loongarch64 after gcc 14 snapshot 20240317 chz0808 at gmail dot com
                   ` (5 preceding siblings ...)
  2024-05-08  8:20 ` rguenth at gcc dot gnu.org
@ 2024-05-08 14:41 ` chz0808 at gmail dot com
  2024-05-09  2:44 ` chenglulu at loongson dot cn
                   ` (14 subsequent siblings)
  21 siblings, 0 replies; 23+ messages in thread
From: chz0808 at gmail dot com @ 2024-05-08 14:41 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114978

--- Comment #7 from Chen Chen <chz0808 at gmail dot com> ---
(In reply to Richard Biener from comment #6)
> https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=471.407.0
> 
> shows a recent improvement that then regressed again; maybe you have a
> similar artifact from your choice of snapshots.  Try a snapshot from
> February for comparison, for example.

Thanks for the nice graph!

We ran more tests on the AMD Zen4 CPU R9 7940HS (system: Debian 12) with
parameters similar to yours. The results are as follows:

with parameters "-m64 -g -Ofast -march=native":
13.2.0:    30.0 (87.2s) [gcc 13.2 release]
20240121:  29.8 (88.0s) [gcc 14 snapshot]
20240218:  29.8 (88.0s) [gcc 14 snapshot]
20240303:  29.2 (89.8s) [gcc 14 snapshot]
20240310:  31.7 (82.6s) [gcc 14 snapshot]
20240317:  31.7 (82.7s) [gcc 14 snapshot]
20240324:  28.3 (92.5s) [gcc 14 snapshot]
20240430:  28.4 (92.3s) [gcc 14 snapshot, 14.1.0-RC]
14.1.0:    28.4 (92.4s) [gcc 14.1 release]

with parameters "-m64 -g -Ofast -march=native -flto":
13.2.0:    30.5 (85.8s) [gcc 13.2 release]
20240121:  30.5 (85.9s) [gcc 14 snapshot]
20240218:  29.5 (88.7s) [gcc 14 snapshot]
20240303:  30.5 (86.0s) [gcc 14 snapshot]
20240310:  31.6 (82.8s) [gcc 14 snapshot]
20240317:  31.7 (82.7s) [gcc 14 snapshot]
20240324:  28.6 (91.6s) [gcc 14 snapshot]
20240430:  29.1 (89.9s) [gcc 14 snapshot, 14.1.0-RC]
14.1.0:    29.1 (90.1s) [gcc 14.1 release]

The scores with gcc 14.1 release are 8.2-10.4% lower than those with gcc 14
snapshot 20240317, and 4.6-5.3% lower than those with gcc 13.2 release.
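
The percentages above can be reproduced from the two tables in this comment. The sketch below is only a convenience for checking the arithmetic; the helper name `percent_lower` is made up for this example.

```python
def percent_lower(baseline, new):
    """How many percent `new` is below `baseline`."""
    return (baseline - new) / baseline * 100.0


# Scores copied from the tables in this comment (AMD R9 7940HS, -march=native).
SCORES = {
    "native": {"13.2.0": 30.0, "20240317": 31.7, "14.1.0": 28.4},
    "native+lto": {"13.2.0": 30.5, "20240317": 31.7, "14.1.0": 29.1},
}

for flags, s in SCORES.items():
    vs_snapshot = percent_lower(s["20240317"], s["14.1.0"])
    vs_release = percent_lower(s["13.2.0"], s["14.1.0"])
    print(f"{flags}: {vs_snapshot:.1f}% below 20240317, "
          f"{vs_release:.1f}% below 13.2.0")
```

The drop relative to the 20240317 snapshot is roughly twice the drop relative to the 13.2 release, which fits the suggestion in comment #6 that the snapshot sits on a temporary improvement.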


* [Bug target/114978] [14/15 regression] 548.exchange2_r 14%-28% regressions on Loongarch64 after gcc 14 snapshot 20240317
  2024-05-07 17:58 [Bug fortran/114978] New: 548.exchange2_r 14%-28% regressions on Loongarch64 after gcc 14 snapshot 20240317 chz0808 at gmail dot com
                   ` (6 preceding siblings ...)
  2024-05-08 14:41 ` chz0808 at gmail dot com
@ 2024-05-09  2:44 ` chenglulu at loongson dot cn
  2024-05-09  3:36 ` xry111 at gcc dot gnu.org
                   ` (13 subsequent siblings)
  21 siblings, 0 replies; 23+ messages in thread
From: chenglulu at loongson dot cn @ 2024-05-09  2:44 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114978

--- Comment #8 from chenglulu <chenglulu at loongson dot cn> ---
(In reply to Chen Chen from comment #0)
> We tested Loongarch64 CPU Loongson 3A6000 with "LA664" architecture in Linux
> operating system AOSC OS 11.4.0 (default gcc version is 13.2.0). And we
> found the 548.exchange2_r benchmark from SPEC 2017 INTrate suite suffered
> significant regressions from 14% to 28% with various compiling options.
> 
> The rate-1 results are following:
> 
/* snip */
> 
> after snapshot 20240317 score 18-23.1% lower with parameters "-g -Ofast
> -march=la664":       
> 13.2.0:    "-march=la664" flag is not supported
> 20240317:  11.5 (227s)
> 20240324:  8.84 (296s)
> 20240430:  9.43 (278s)
> 14.1.0:    9.42 (278s)
> 
/* snip */
> 
> 
> after snapshot 20240317 score 26.3-26.6% lower with parameters "-g -Ofast
> -march=la464":       
> 13.2.0:    8.76 (299s)
> 20240317:  12.8 (205s)
> 20240324:  9.39 (279s)
> 20240430:  9.43 (278s)
> 14.1.0:    9.43 (278s)
> 
> 

> 20240317:  11.5 (227s) -march=la664
> 20240317:  12.8 (205s) -march=la464
I looked into the gap between the two results above. The performance
regression is caused by r14-6814. With the following modification, the scores
for -march=la664 and -march=la464 are the same.

diff --git a/gcc/config/loongarch/loongarch-def.cc b/gcc/config/loongarch/loongarch-def.cc
index e8c129ce643..f27284cb20a 100644
--- a/gcc/config/loongarch/loongarch-def.cc
+++ b/gcc/config/loongarch/loongarch-def.cc
@@ -111,11 +111,7 @@ loongarch_rtx_cost_data::loongarch_rtx_cost_data ()
  tune targets (i.e. -mtune=native while PRID does not correspond to
  any known "-mtune" type).  */
 array_tune<loongarch_rtx_cost_data> loongarch_cpu_rtx_cost_data =
-  array_tune<loongarch_rtx_cost_data> ()
-    .set (CPU_LA664,
-         loongarch_rtx_cost_data ()
-           .movcf2gr_ (COSTS_N_INSNS (1))
-           .movgr2cf_ (COSTS_N_INSNS (1)));
+  array_tune<loongarch_rtx_cost_data> ();


* [Bug target/114978] [14/15 regression] 548.exchange2_r 14%-28% regressions on Loongarch64 after gcc 14 snapshot 20240317
  2024-05-07 17:58 [Bug fortran/114978] New: 548.exchange2_r 14%-28% regressions on Loongarch64 after gcc 14 snapshot 20240317 chz0808 at gmail dot com
                   ` (7 preceding siblings ...)
  2024-05-09  2:44 ` chenglulu at loongson dot cn
@ 2024-05-09  3:36 ` xry111 at gcc dot gnu.org
  2024-05-09  3:40 ` chenglulu at loongson dot cn
                   ` (12 subsequent siblings)
  21 siblings, 0 replies; 23+ messages in thread
From: xry111 at gcc dot gnu.org @ 2024-05-09  3:36 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114978

--- Comment #9 from Xi Ruoyao <xry111 at gcc dot gnu.org> ---
(In reply to chenglulu from comment #8)

> diff --git a/gcc/config/loongarch/loongarch-def.cc
> b/gcc/config/loongarch/loongarch-def.cc
> index e8c129ce643..f27284cb20a 100644
> --- a/gcc/config/loongarch/loongarch-def.cc
> +++ b/gcc/config/loongarch/loongarch-def.cc
> @@ -111,11 +111,7 @@ loongarch_rtx_cost_data::loongarch_rtx_cost_data ()
>   tune targets (i.e. -mtune=native while PRID does not correspond to
>   any known "-mtune" type).  */
>  array_tune<loongarch_rtx_cost_data> loongarch_cpu_rtx_cost_data =
> -  array_tune<loongarch_rtx_cost_data> ()
> -    .set (CPU_LA664,
> -         loongarch_rtx_cost_data ()
> -           .movcf2gr_ (COSTS_N_INSNS (1))
> -           .movgr2cf_ (COSTS_N_INSNS (1)));
> +  array_tune<loongarch_rtx_cost_data> ();

But why?  Aren't movcf2gr and movgr2cf one-cycle on LA664?


* [Bug target/114978] [14/15 regression] 548.exchange2_r 14%-28% regressions on Loongarch64 after gcc 14 snapshot 20240317
  2024-05-07 17:58 [Bug fortran/114978] New: 548.exchange2_r 14%-28% regressions on Loongarch64 after gcc 14 snapshot 20240317 chz0808 at gmail dot com
                   ` (8 preceding siblings ...)
  2024-05-09  3:36 ` xry111 at gcc dot gnu.org
@ 2024-05-09  3:40 ` chenglulu at loongson dot cn
  2024-05-09 11:09 ` chenglulu at loongson dot cn
                   ` (11 subsequent siblings)
  21 siblings, 0 replies; 23+ messages in thread
From: chenglulu at loongson dot cn @ 2024-05-09  3:40 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114978

--- Comment #10 from chenglulu <chenglulu at loongson dot cn> ---
(In reply to Xi Ruoyao from comment #9)
> (In reply to chenglulu from comment #8)
> 
> > diff --git a/gcc/config/loongarch/loongarch-def.cc
> > b/gcc/config/loongarch/loongarch-def.cc
> > index e8c129ce643..f27284cb20a 100644
> > --- a/gcc/config/loongarch/loongarch-def.cc
> > +++ b/gcc/config/loongarch/loongarch-def.cc
> > @@ -111,11 +111,7 @@ loongarch_rtx_cost_data::loongarch_rtx_cost_data ()
> >   tune targets (i.e. -mtune=native while PRID does not correspond to
> >   any known "-mtune" type).  */
> >  array_tune<loongarch_rtx_cost_data> loongarch_cpu_rtx_cost_data =
> > -  array_tune<loongarch_rtx_cost_data> ()
> > -    .set (CPU_LA664,
> > -         loongarch_rtx_cost_data ()
> > -           .movcf2gr_ (COSTS_N_INSNS (1))
> > -           .movgr2cf_ (COSTS_N_INSNS (1)));
> > +  array_tune<loongarch_rtx_cost_data> ();
> 
> But why?  Aren't movcf2gr and movgr2cf one-cycle on LA664?

I think this is weird too. I'm still testing other situations, and I'll find
out the reason after the testing is completed.


* [Bug target/114978] [14/15 regression] 548.exchange2_r 14%-28% regressions on Loongarch64 after gcc 14 snapshot 20240317
  2024-05-07 17:58 [Bug fortran/114978] New: 548.exchange2_r 14%-28% regressions on Loongarch64 after gcc 14 snapshot 20240317 chz0808 at gmail dot com
                   ` (9 preceding siblings ...)
  2024-05-09  3:40 ` chenglulu at loongson dot cn
@ 2024-05-09 11:09 ` chenglulu at loongson dot cn
  2024-05-09 12:57 ` chz0808 at gmail dot com
                   ` (10 subsequent siblings)
  21 siblings, 0 replies; 23+ messages in thread
From: chenglulu at loongson dot cn @ 2024-05-09 11:09 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114978

--- Comment #11 from chenglulu <chenglulu at loongson dot cn> ---
(In reply to Chen Chen from comment #0)
> We tested Loongarch64 CPU Loongson 3A6000 with "LA664" architecture in Linux
> operating system AOSC OS 11.4.0 (default gcc version is 13.2.0). And we
> found the 548.exchange2_r benchmark from SPEC 2017 INTrate suite suffered
> significant regressions from 14% to 28% with various compiling options.
> 
> The rate-1 results are following:
> 
> after snapshot 20240317 score 14.3-19.3% lower with parameters "-g -Ofast
> -march=native":
> 13.2.0:    11.7 (223s) [gcc 13.2.0, system default]
Hi:

 I can't reproduce the score of r13.2. Have you made any modifications there?


* [Bug target/114978] [14/15 regression] 548.exchange2_r 14%-28% regressions on Loongarch64 after gcc 14 snapshot 20240317
  2024-05-07 17:58 [Bug fortran/114978] New: 548.exchange2_r 14%-28% regressions on Loongarch64 after gcc 14 snapshot 20240317 chz0808 at gmail dot com
                   ` (10 preceding siblings ...)
  2024-05-09 11:09 ` chenglulu at loongson dot cn
@ 2024-05-09 12:57 ` chz0808 at gmail dot com
  2024-05-09 13:09 ` xry111 at gcc dot gnu.org
                   ` (9 subsequent siblings)
  21 siblings, 0 replies; 23+ messages in thread
From: chz0808 at gmail dot com @ 2024-05-09 12:57 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114978

--- Comment #12 from Chen Chen <chz0808 at gmail dot com> ---
(In reply to chenglulu from comment #11)
> (In reply to Chen Chen from comment #0)
> > We tested Loongarch64 CPU Loongson 3A6000 with "LA664" architecture in Linux
> > operating system AOSC OS 11.4.0 (default gcc version is 13.2.0). And we
> > found the 548.exchange2_r benchmark from SPEC 2017 INTrate suite suffered
> > significant regressions from 14% to 28% with various compiling options.
> > 
> > The rate-1 results are following:
> > 
> > after snapshot 20240317 score 14.3-19.3% lower with parameters "-g -Ofast
> > -march=native":
> > 13.2.0:    11.7 (223s) [gcc 13.2.0, system default]
> Hi:
> 
>  I can't reproduce the score of r13.2. Have you made any modifications there?

No, I used the system default gcc. How big is the difference? A little
fluctuation is normal: in earlier tests with "-g -Ofast -march=native" I also
got scores of 11.3 (232s) and 11.5 (227s). To be fair, before each test
presented above I freed the page cache with the command
"echo 3 > /proc/sys/vm/drop_caches" and then ran the test.
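
To put the fluctuation in perspective, the spread across the three AOSC gcc 13.2 scores mentioned in this thread can be compared with the smallest regression in the original report (14.3%). This is only a rough sketch: three runs are too few for real statistics, and the helper name is made up for this example.

```python
# Scores for AOSC's default gcc 13.2 with "-g -Ofast -march=native":
# 11.7 from the original report, 11.3 and 11.5 from earlier runs.
AOSC_GCC13_NATIVE_SCORES = [11.7, 11.3, 11.5]

# Smallest regression claimed in the original report, in percent.
SMALLEST_REPORTED_REGRESSION = 14.3


def relative_spread(scores):
    """Gap between best and worst score, as a percentage of the best."""
    return (max(scores) - min(scores)) / max(scores) * 100.0


spread = relative_spread(AOSC_GCC13_NATIVE_SCORES)
print(f"run-to-run spread: {spread:.1f}%")
print(f"smallest reported regression: {SMALLEST_REPORTED_REGRESSION}%")
```

On these numbers the run-to-run spread is a few percent, well below the reported regressions, so ordinary noise alone is unlikely to explain them.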


* [Bug target/114978] [14/15 regression] 548.exchange2_r 14%-28% regressions on Loongarch64 after gcc 14 snapshot 20240317
  2024-05-07 17:58 [Bug fortran/114978] New: 548.exchange2_r 14%-28% regressions on Loongarch64 after gcc 14 snapshot 20240317 chz0808 at gmail dot com
                   ` (11 preceding siblings ...)
  2024-05-09 12:57 ` chz0808 at gmail dot com
@ 2024-05-09 13:09 ` xry111 at gcc dot gnu.org
  2024-05-10  3:12 ` chz0808 at gmail dot com
                   ` (8 subsequent siblings)
  21 siblings, 0 replies; 23+ messages in thread
From: xry111 at gcc dot gnu.org @ 2024-05-09 13:09 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114978

--- Comment #13 from Xi Ruoyao <xry111 at gcc dot gnu.org> ---
(In reply to Chen Chen from comment #12)

> No. I used system default gcc.

AOSC backports *many* changes not in upstream GCC 13.2 to their "13.2":
https://github.com/AOSC-Dev/aosc-os-abbs/tree/stable/core-devel/gcc/01-runtime/patches

So the default GCC is simply not GCC 13.2.


* [Bug target/114978] [14/15 regression] 548.exchange2_r 14%-28% regressions on Loongarch64 after gcc 14 snapshot 20240317
  2024-05-07 17:58 [Bug fortran/114978] New: 548.exchange2_r 14%-28% regressions on Loongarch64 after gcc 14 snapshot 20240317 chz0808 at gmail dot com
                   ` (12 preceding siblings ...)
  2024-05-09 13:09 ` xry111 at gcc dot gnu.org
@ 2024-05-10  3:12 ` chz0808 at gmail dot com
  2024-05-10  3:15 ` chenglulu at loongson dot cn
                   ` (7 subsequent siblings)
  21 siblings, 0 replies; 23+ messages in thread
From: chz0808 at gmail dot com @ 2024-05-10  3:12 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114978

--- Comment #14 from Chen Chen <chz0808 at gmail dot com> ---
(In reply to Xi Ruoyao from comment #13)
> (In reply to Chen Chen from comment #12)
> 
> > No. I used system default gcc.
> 
> AOSC backports *many* changes not in upstream GCC 13.2 to their "13.2":
> https://github.com/AOSC-Dev/aosc-os-abbs/tree/stable/core-devel/gcc/01-
> runtime/patches
> 
> So the default GCC is simply not GCC 13.2.

You are correct. The 13.2 results above should be labeled "AOSC system default
gcc 13.2". On the AOSC system I rebuilt the official gcc 13.2 source with the
same configure parameters, except that "--with-tune=la664" was changed to
"--with-tune=la464" because gcc 13.2 does not support the "LA664"
architecture. The results with official gcc 13.2 are as follows:

-g -Ofast -march=native      : 6.54 (400s)
-g -Ofast -march=native -flto: 6.57 (399s)
-g -Ofast -march=la464       : 6.46 (405s)
-g -Ofast -march=la464 -flto : 6.57 (399s)


* [Bug target/114978] [14/15 regression] 548.exchange2_r 14%-28% regressions on Loongarch64 after gcc 14 snapshot 20240317
  2024-05-07 17:58 [Bug fortran/114978] New: 548.exchange2_r 14%-28% regressions on Loongarch64 after gcc 14 snapshot 20240317 chz0808 at gmail dot com
                   ` (13 preceding siblings ...)
  2024-05-10  3:12 ` chz0808 at gmail dot com
@ 2024-05-10  3:15 ` chenglulu at loongson dot cn
  2024-05-15  1:24 ` chenglulu at loongson dot cn
                   ` (6 subsequent siblings)
  21 siblings, 0 replies; 23+ messages in thread
From: chenglulu at loongson dot cn @ 2024-05-10  3:15 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114978

--- Comment #15 from chenglulu <chenglulu at loongson dot cn> ---
(In reply to Chen Chen from comment #14)
> (In reply to Xi Ruoyao from comment #13)
> > (In reply to Chen Chen from comment #12)
> > 
> > > No. I used system default gcc.
> > 
> > AOSC backports *many* changes not in upstream GCC 13.2 to their "13.2":
> > https://github.com/AOSC-Dev/aosc-os-abbs/tree/stable/core-devel/gcc/01-
> > runtime/patches
> > 
> > So the default GCC is simply not GCC 13.2.
> 
> You are correct. The 13.2 results above should be labeled "AOSC system
> default gcc 13.2". On the AOSC system I rebuilt the official gcc 13.2 source
> with the same configure parameters, except that "--with-tune=la664" was
> changed to "--with-tune=la464" because gcc 13.2 does not support the "LA664"
> architecture. The results with official gcc 13.2 are as follows:
> 
> -g -Ofast -march=native      : 6.54 (400s)
> -g -Ofast -march=native -flto: 6.57 (399s)
> -g -Ofast -march=la464       : 6.46 (405s)
> -g -Ofast -march=la464 -flto : 6.57 (399s)

My results for official gcc 13.2 are similar to these. I am currently testing
gcc with the AOSC patches applied.


* [Bug target/114978] [14/15 regression] 548.exchange2_r 14%-28% regressions on Loongarch64 after gcc 14 snapshot 20240317
  2024-05-07 17:58 [Bug fortran/114978] New: 548.exchange2_r 14%-28% regressions on Loongarch64 after gcc 14 snapshot 20240317 chz0808 at gmail dot com
                   ` (14 preceding siblings ...)
  2024-05-10  3:15 ` chenglulu at loongson dot cn
@ 2024-05-15  1:24 ` chenglulu at loongson dot cn
  2024-05-15  3:41 ` xry111 at gcc dot gnu.org
                   ` (5 subsequent siblings)
  21 siblings, 0 replies; 23+ messages in thread
From: chenglulu at loongson dot cn @ 2024-05-15  1:24 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114978

--- Comment #16 from chenglulu <chenglulu at loongson dot cn> ---
The performance degradation on LoongArch is caused by one commit:

commit e0e9499aeffdaca88f0f29334384aa5f710a81a4 (HEAD -> trunk)
Author: Richard Biener <rguenther@suse.de>
Date:   Tue Mar 19 12:24:08 2024 +0100

    tree-optimization/114151 - revert PR114074 fix

    The following reverts the chrec_fold_multiply fix and only keeps
    handling of constant overflow which keeps the original testcase
    fixed.  A better solution might involve ranger improvements or
    tracking of assumptions during SCEV analysis similar to what niter
    analysis does.

            PR tree-optimization/114151
            PR tree-optimization/114269
            PR tree-optimization/114322
            PR tree-optimization/114074
            * tree-chrec.cc (chrec_fold_multiply): Restrict the use of
            unsigned arithmetic when actual overflow on constant operands
            is observed.

            * gcc.dg/pr68317.c: Revert last change.

The scores before and after this commit (-g -Ofast -march=la464) are:
r14-9539: 12.3
r14-9540: 9.26
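
For readers unfamiliar with how a single commit gets isolated, a bisection over an ordered revision range can be sketched as below. This is a generic illustration and not necessarily the procedure used here. Only the two scores for r14-9539 and r14-9540 come from this comment; the revision range and the `stand_in_score` function are placeholders for "build this revision and run 548.exchange2_r".

```python
def first_bad(revisions, is_bad):
    """Return the first revision for which is_bad() is true.

    Assumes revisions are ordered oldest to newest, the first one is good,
    the last one is bad, and a revision stays bad once it turns bad.
    """
    lo, hi = 0, len(revisions) - 1  # lo is known good, hi is known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(revisions[mid]):
            hi = mid
        else:
            lo = mid
    return revisions[hi]


# Hypothetical revision range around the reported pair.
REVISIONS = [f"r14-{n}" for n in range(9530, 9546)]


def stand_in_score(revision):
    """Placeholder for building the revision and running the benchmark.

    Only the values 12.3 (r14-9539) and 9.26 (r14-9540) are from the report;
    extending them to the neighbouring revisions is an assumption.
    """
    n = int(revision.split("-")[1])
    return 12.3 if n <= 9539 else 9.26


# Treat anything below the midpoint of the two reported scores as "bad".
THRESHOLD = (12.3 + 9.26) / 2
culprit = first_bad(REVISIONS, lambda r: stand_in_score(r) < THRESHOLD)
print(culprit)
```

With 16 candidate revisions this needs four benchmark runs instead of sixteen, which matters when each run takes several minutes.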


* [Bug target/114978] [14/15 regression] 548.exchange2_r 14%-28% regressions on Loongarch64 after gcc 14 snapshot 20240317
  2024-05-07 17:58 [Bug fortran/114978] New: 548.exchange2_r 14%-28% regressions on Loongarch64 after gcc 14 snapshot 20240317 chz0808 at gmail dot com
                   ` (15 preceding siblings ...)
  2024-05-15  1:24 ` chenglulu at loongson dot cn
@ 2024-05-15  3:41 ` xry111 at gcc dot gnu.org
  2024-05-15  3:57 ` chenglulu at loongson dot cn
                   ` (4 subsequent siblings)
  21 siblings, 0 replies; 23+ messages in thread
From: xry111 at gcc dot gnu.org @ 2024-05-15  3:41 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114978

Xi Ruoyao <xry111 at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|needs-bisection             |missed-optimization

--- Comment #17 from Xi Ruoyao <xry111 at gcc dot gnu.org> ---
Strangely, PR114074 is a wrong-code bug (not a missed-optimization), and
reverting its fix seems to improve performance for other targets...


* [Bug target/114978] [14/15 regression] 548.exchange2_r 14%-28% regressions on Loongarch64 after gcc 14 snapshot 20240317
  2024-05-07 17:58 [Bug fortran/114978] New: 548.exchange2_r 14%-28% regressions on Loongarch64 after gcc 14 snapshot 20240317 chz0808 at gmail dot com
                   ` (16 preceding siblings ...)
  2024-05-15  3:41 ` xry111 at gcc dot gnu.org
@ 2024-05-15  3:57 ` chenglulu at loongson dot cn
  2024-05-21 12:46 ` chenglulu at loongson dot cn
                   ` (3 subsequent siblings)
  21 siblings, 0 replies; 23+ messages in thread
From: chenglulu at loongson dot cn @ 2024-05-15  3:57 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114978

--- Comment #18 from chenglulu <chenglulu at loongson dot cn> ---
(In reply to Xi Ruoyao from comment #17)
> Strangely, PR114074 is a wrong-code bug (not a missed-optimization), yet
> reverting its fix seems to improve performance for other targets...

This is very strange. I tried turning off reg_reg addressing on top of
r14-9540, and the performance was then close to that of r14-9539.
Unfortunately, I still don't know why.

* [Bug target/114978] [14/15 regression] 548.exchange2_r 14%-28% regressions on Loongarch64 after gcc 14 snapshot 20240317
  2024-05-07 17:58 [Bug fortran/114978] New: 548.exchange2_r 14%-28% regressions on Loongarch64 after gcc 14 snapshot 20240317 chz0808 at gmail dot com
                   ` (17 preceding siblings ...)
  2024-05-15  3:57 ` chenglulu at loongson dot cn
@ 2024-05-21 12:46 ` chenglulu at loongson dot cn
  2024-05-21 12:47 ` chenglulu at loongson dot cn
                   ` (2 subsequent siblings)
  21 siblings, 0 replies; 23+ messages in thread
From: chenglulu at loongson dot cn @ 2024-05-21 12:46 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114978

--- Comment #19 from chenglulu <chenglulu at loongson dot cn> ---
diff --git a/gcc/config/loongarch/loongarch.cc b/gcc/config/loongarch/loongarch.cc
index e7835ae34ae..6a808cb0a5c 100644
--- a/gcc/config/loongarch/loongarch.cc
+++ b/gcc/config/loongarch/loongarch.cc
@@ -2383,7 +2383,7 @@ loongarch_address_insns (rtx x, machine_mode mode, bool might_split_p)
        return factor;

       case ADDRESS_REG_REG:
-       return factor;
+       return factor * 3;

       case ADDRESS_CONST_INT:
        return lsx_p ? 0 : factor;

With this patch, -march=la464 has a score of 11.9.
However, the specific revision plan has not yet been decided.

* [Bug target/114978] [14/15 regression] 548.exchange2_r 14%-28% regressions on Loongarch64 after gcc 14 snapshot 20240317
  2024-05-07 17:58 [Bug fortran/114978] New: 548.exchange2_r 14%-28% regressions on Loongarch64 after gcc 14 snapshot 20240317 chz0808 at gmail dot com
                   ` (18 preceding siblings ...)
  2024-05-21 12:46 ` chenglulu at loongson dot cn
@ 2024-05-21 12:47 ` chenglulu at loongson dot cn
  2024-05-21 13:01 ` xry111 at gcc dot gnu.org
  2024-05-21 13:04 ` chenglulu at loongson dot cn
  21 siblings, 0 replies; 23+ messages in thread
From: chenglulu at loongson dot cn @ 2024-05-21 12:47 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114978

--- Comment #20 from chenglulu <chenglulu at loongson dot cn> ---
(In reply to chenglulu from comment #19)
> diff --git a/gcc/config/loongarch/loongarch.cc
> b/gcc/config/loongarch/loongarch.cc
> index e7835ae34ae..6a808cb0a5c 100644
> --- a/gcc/config/loongarch/loongarch.cc
> +++ b/gcc/config/loongarch/loongarch.cc
> @@ -2383,7 +2383,7 @@ loongarch_address_insns (rtx x, machine_mode mode,
> bool might_split_p)
>         return factor;
>  
>        case ADDRESS_REG_REG:
> -       return factor;
> +       return factor * 3;
>  
>        case ADDRESS_CONST_INT:
>         return lsx_p ? 0 : factor;
> 
> With this patch, -march=la464 has a score of 11.9.
> However, the specific revision plan has not yet been decided.

This is the score for r14-9540 with the patch applied.

* [Bug target/114978] [14/15 regression] 548.exchange2_r 14%-28% regressions on Loongarch64 after gcc 14 snapshot 20240317
  2024-05-07 17:58 [Bug fortran/114978] New: 548.exchange2_r 14%-28% regressions on Loongarch64 after gcc 14 snapshot 20240317 chz0808 at gmail dot com
                   ` (19 preceding siblings ...)
  2024-05-21 12:47 ` chenglulu at loongson dot cn
@ 2024-05-21 13:01 ` xry111 at gcc dot gnu.org
  2024-05-21 13:04 ` chenglulu at loongson dot cn
  21 siblings, 0 replies; 23+ messages in thread
From: xry111 at gcc dot gnu.org @ 2024-05-21 13:01 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114978

--- Comment #21 from Xi Ruoyao <xry111 at gcc dot gnu.org> ---
(In reply to chenglulu from comment #19)
> diff --git a/gcc/config/loongarch/loongarch.cc
> b/gcc/config/loongarch/loongarch.cc
> index e7835ae34ae..6a808cb0a5c 100644
> --- a/gcc/config/loongarch/loongarch.cc
> +++ b/gcc/config/loongarch/loongarch.cc
> @@ -2383,7 +2383,7 @@ loongarch_address_insns (rtx x, machine_mode mode,
> bool might_split_p)
>         return factor;
>  
>        case ADDRESS_REG_REG:
> -       return factor;
> +       return factor * 3;
>  
>        case ADDRESS_CONST_INT:
>         return lsx_p ? 0 : factor;
> 
> With this patch, -march=la464 has a score of 11.9.
> However, the specific revision plan has not yet been decided.

Hmm are ldx and stx really so slow?

* [Bug target/114978] [14/15 regression] 548.exchange2_r 14%-28% regressions on Loongarch64 after gcc 14 snapshot 20240317
  2024-05-07 17:58 [Bug fortran/114978] New: 548.exchange2_r 14%-28% regressions on Loongarch64 after gcc 14 snapshot 20240317 chz0808 at gmail dot com
                   ` (20 preceding siblings ...)
  2024-05-21 13:01 ` xry111 at gcc dot gnu.org
@ 2024-05-21 13:04 ` chenglulu at loongson dot cn
  21 siblings, 0 replies; 23+ messages in thread
From: chenglulu at loongson dot cn @ 2024-05-21 13:04 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114978

--- Comment #22 from chenglulu <chenglulu at loongson dot cn> ---
(In reply to Xi Ruoyao from comment #21)
> (In reply to chenglulu from comment #19)
> > diff --git a/gcc/config/loongarch/loongarch.cc
> > b/gcc/config/loongarch/loongarch.cc
> > index e7835ae34ae..6a808cb0a5c 100644
> > --- a/gcc/config/loongarch/loongarch.cc
> > +++ b/gcc/config/loongarch/loongarch.cc
> > @@ -2383,7 +2383,7 @@ loongarch_address_insns (rtx x, machine_mode mode,
> > bool might_split_p)
> >         return factor;
> >  
> >        case ADDRESS_REG_REG:
> > -       return factor;
> > +       return factor * 3;
> >  
> >        case ADDRESS_CONST_INT:
> >         return lsx_p ? 0 : factor;
> > 
> > With this patch, -march=la464 has a score of 11.9.
> > However, the specific revision plan has not yet been decided.
> 
> Hmm are ldx and stx really so slow?

I think it's more likely because LDX/STX use an extra register.

Thread overview: 23+ messages
2024-05-07 17:58 [Bug fortran/114978] New: 548.exchange2_r 14%-28% regressions on Loongarch64 after gcc 14 snapshot 20240317 chz0808 at gmail dot com
2024-05-07 18:12 ` [Bug fortran/114978] [14/14 regression] " sjames at gcc dot gnu.org
2024-05-07 18:13 ` xry111 at gcc dot gnu.org
2024-05-07 18:17 ` [Bug target/114978] " xry111 at gcc dot gnu.org
2024-05-07 18:18 ` xry111 at gcc dot gnu.org
2024-05-08  1:41 ` [Bug target/114978] [14/15 " chenglulu at loongson dot cn
2024-05-08  8:20 ` rguenth at gcc dot gnu.org
2024-05-08 14:41 ` chz0808 at gmail dot com
2024-05-09  2:44 ` chenglulu at loongson dot cn
2024-05-09  3:36 ` xry111 at gcc dot gnu.org
2024-05-09  3:40 ` chenglulu at loongson dot cn
2024-05-09 11:09 ` chenglulu at loongson dot cn
2024-05-09 12:57 ` chz0808 at gmail dot com
2024-05-09 13:09 ` xry111 at gcc dot gnu.org
2024-05-10  3:12 ` chz0808 at gmail dot com
2024-05-10  3:15 ` chenglulu at loongson dot cn
2024-05-15  1:24 ` chenglulu at loongson dot cn
2024-05-15  3:41 ` xry111 at gcc dot gnu.org
2024-05-15  3:57 ` chenglulu at loongson dot cn
2024-05-21 12:46 ` chenglulu at loongson dot cn
2024-05-21 12:47 ` chenglulu at loongson dot cn
2024-05-21 13:01 ` xry111 at gcc dot gnu.org
2024-05-21 13:04 ` chenglulu at loongson dot cn