* [Bug target/84201] 549.fotonik3d_r from SPEC2017 fails verification with recent Intel and AMD CPUs
[not found] <bug-84201-4@http.gcc.gnu.org/bugzilla/>
@ 2020-09-17 20:46 ` mailboxnotfound at yahoo dot com
2022-02-17 13:55 ` rguenth at gcc dot gnu.org
` (7 subsequent siblings)
8 siblings, 0 replies; 9+ messages in thread
From: mailboxnotfound at yahoo dot com @ 2020-09-17 20:46 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84201
john henning <mailboxnotfound at yahoo dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |mailboxnotfound at yahoo dot com
--- Comment #12 from john henning <mailboxnotfound at yahoo dot com> ---
I contributed to the development of benchmark 549.fotonik3d_r. The opinions
herein are my own, not necessarily SPEC's.
Martin Liška wrote in comment 9:
> adjusted tolerance for the test from 1e-10 to 1e-9
That change would have been highly desirable, if this problem had been found
prior to the release of CPU 2017. Unfortunately, post-release, it is very
difficult to change a SPEC CPU benchmark, because of the philosophy of "no
moving targets".
To be clear, a rule-compliant SPEC CPU run is not allowed to change the
tolerance.
Why wasn't the problem found before release?
Although GCC was tested prior to release of CPU 2017, the circumstances that
lead to this problem were not encountered. As Steve Ellcey wrote in Comment 5,
the problem comes and goes with various optimizations:
> -Ofast fails
> -Ofast -fno-unsafe-math-optimizations works
> -Ofast -fno-tree-loop-vectorize works
> -O3 works
In addition, it appears that the problem:
- Depends on the particular architecture. In my tests today, it disappears
when I remove -march=native
- Does not happen in the presence of FDO (Feedback-Directed Optimization,
i.e. -fprofile-generate / -fprofile-use), typically used by "peak" tuning (see
https://www.spec.org/cpu2017/Docs/overview.html#Q16 for info on base vs. peak).
SPEC CPU Examples
SPEC CPU 2017 users who start from the Example config files for GCC in
$SPEC/config are unlikely to hit the problem because most of the GCC Example
config files use -O3 (not -Ofast) for base. However, if a user modifies the
example to use
-Ofast -march=native
then they will need to also add -fno-unsafe-math-optimizations or
-fno-tree-loop-vectorize.
Working around the problem
The same-for-all rule --
https://www.spec.org/cpu2017/Docs/runrules.html#BaseFlags -- says that all
benchmarks in a suite of a given language must use the same base flags. Here
are several examples of how a config file could obey that rule while working
around the problem:
Option (a) - In base, avoid Ofast for Fortran
default=base:
COPTIMIZE = -Ofast -flto -march=native
CXXOPTIMIZE = -Ofast -flto -march=native
FOPTIMIZE = -O3 -flto -march=native
Option (b) - In base, avoid -march=native for Fortran
default=base:
COPTIMIZE = -Ofast -flto -march=native
CXXOPTIMIZE = -Ofast -flto -march=native
FOPTIMIZE = -Ofast -flto
Option (c) - Turn off tree loop vectorizer for Fortran
default=base:
OPTIMIZE = -Ofast -flto -march=native
fprate,fpspeed=base:
FOPTIMIZE = -fno-tree-loop-vectorize
Option (d) - Turn off unsafe math for Fortran
default=base:
OPTIMIZE = -Ofast -flto -march=native
fprate,fpspeed=base:
FOPTIMIZE = -fno-unsafe-math-optimizations
Performance impact
The performance impact of the options above will be system dependent, and may
depend on how hard you exercise the system (e.g. one copy or many copies).
For a particular system tested by one particular person running only one copy,
here are the results of the above 4 options, normalized to option (a):
Performance of the Fortran benchmarks in SPECrate2017 FP
Normalized to option (a). Higher is better
503. 507. 521. 527. 549. 554.
bwaves cactu wrf cam4 fotonik roms geomean
(a) O3 1.00 1.00 1.00 1.00 1.00 1.00 1.000
(b) no native 1.31 .82 1.31 1.01 .93 .98 1.045
(c) no vect 1.31 1.01 .94 .89 .90 .85 .973
(d) no unsafe .99 1.02 1.36 1.01 1.03 1.01 1.064
Given the above, at the moment option (d) seems best.
Next steps
I will add a summary of this discussion to
https://www.spec.org/cpu2017/Docs/benchmarks/549.fotonik3d_r.html
Thank you Martin, Martin, Steve, and Richard for the clarity in this report.
^ permalink raw reply [flat|nested] 9+ messages in thread
* [Bug target/84201] 549.fotonik3d_r from SPEC2017 fails verification with recent Intel and AMD CPUs
[not found] <bug-84201-4@http.gcc.gnu.org/bugzilla/>
2020-09-17 20:46 ` [Bug target/84201] 549.fotonik3d_r from SPEC2017 fails verification with recent Intel and AMD CPUs mailboxnotfound at yahoo dot com
@ 2022-02-17 13:55 ` rguenth at gcc dot gnu.org
2022-02-17 15:48 ` jamborm at gcc dot gnu.org
` (6 subsequent siblings)
8 siblings, 0 replies; 9+ messages in thread
From: rguenth at gcc dot gnu.org @ 2022-02-17 13:55 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84201
--- Comment #13 from Richard Biener <rguenth at gcc dot gnu.org> ---
One option is to introduce a less invasive optimization option to avoid the
undesirable vectorization. For example -fno-vectorize-fp-inductions noting
this particular loop is a floating-point induction. Usually those tend not
to be performance critical.
More general -f[no-]vectorize-{in,re}ductions[={fp,int}] would be possible
as well.
^ permalink raw reply [flat|nested] 9+ messages in thread
* [Bug target/84201] 549.fotonik3d_r from SPEC2017 fails verification with recent Intel and AMD CPUs
[not found] <bug-84201-4@http.gcc.gnu.org/bugzilla/>
2020-09-17 20:46 ` [Bug target/84201] 549.fotonik3d_r from SPEC2017 fails verification with recent Intel and AMD CPUs mailboxnotfound at yahoo dot com
2022-02-17 13:55 ` rguenth at gcc dot gnu.org
@ 2022-02-17 15:48 ` jamborm at gcc dot gnu.org
2022-02-18 8:22 ` rguenth at gcc dot gnu.org
` (5 subsequent siblings)
8 siblings, 0 replies; 9+ messages in thread
From: jamborm at gcc dot gnu.org @ 2022-02-17 15:48 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84201
--- Comment #14 from Martin Jambor <jamborm at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #13)
> One option is to introduce a less invasive optimization option[...]
I think that would be useful, yes. It could even be a param if we do not want
to commit to having it forever.
^ permalink raw reply [flat|nested] 9+ messages in thread
* [Bug target/84201] 549.fotonik3d_r from SPEC2017 fails verification with recent Intel and AMD CPUs
[not found] <bug-84201-4@http.gcc.gnu.org/bugzilla/>
` (2 preceding siblings ...)
2022-02-17 15:48 ` jamborm at gcc dot gnu.org
@ 2022-02-18 8:22 ` rguenth at gcc dot gnu.org
2022-03-08 11:12 ` rguenth at gcc dot gnu.org
` (4 subsequent siblings)
8 siblings, 0 replies; 9+ messages in thread
From: rguenth at gcc dot gnu.org @ 2022-02-18 8:22 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84201
--- Comment #15 from Richard Biener <rguenth at gcc dot gnu.org> ---
OK, let me try cooking sth up.
^ permalink raw reply [flat|nested] 9+ messages in thread
* [Bug target/84201] 549.fotonik3d_r from SPEC2017 fails verification with recent Intel and AMD CPUs
[not found] <bug-84201-4@http.gcc.gnu.org/bugzilla/>
` (3 preceding siblings ...)
2022-02-18 8:22 ` rguenth at gcc dot gnu.org
@ 2022-03-08 11:12 ` rguenth at gcc dot gnu.org
2022-03-08 11:53 ` rguenth at gcc dot gnu.org
` (3 subsequent siblings)
8 siblings, 0 replies; 9+ messages in thread
From: rguenth at gcc dot gnu.org @ 2022-03-08 11:12 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84201
--- Comment #16 from Richard Biener <rguenth at gcc dot gnu.org> ---
Created attachment 52582
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52582&action=edit
simple --param patch
This adds the ability to turn off floating point induction vectorization with
--param vect-induction-float=0 which should be usable ontop of -Ofast.
Can somebody with unaltered sources verify if that helps?
^ permalink raw reply [flat|nested] 9+ messages in thread
* [Bug target/84201] 549.fotonik3d_r from SPEC2017 fails verification with recent Intel and AMD CPUs
[not found] <bug-84201-4@http.gcc.gnu.org/bugzilla/>
` (4 preceding siblings ...)
2022-03-08 11:12 ` rguenth at gcc dot gnu.org
@ 2022-03-08 11:53 ` rguenth at gcc dot gnu.org
2022-03-08 14:58 ` cvs-commit at gcc dot gnu.org
` (2 subsequent siblings)
8 siblings, 0 replies; 9+ messages in thread
From: rguenth at gcc dot gnu.org @ 2022-03-08 11:53 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84201
--- Comment #17 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #16)
> Created attachment 52582 [details]
> simple --param patch
>
> This adds the ability to turn off floating point induction vectorization with
> --param vect-induction-float=0 which should be usable ontop of -Ofast.
>
> Can somebody with unaltered sources verify if that helps?
I've verified it works to avoid the miscompare.
^ permalink raw reply [flat|nested] 9+ messages in thread
* [Bug target/84201] 549.fotonik3d_r from SPEC2017 fails verification with recent Intel and AMD CPUs
[not found] <bug-84201-4@http.gcc.gnu.org/bugzilla/>
` (5 preceding siblings ...)
2022-03-08 11:53 ` rguenth at gcc dot gnu.org
@ 2022-03-08 14:58 ` cvs-commit at gcc dot gnu.org
2022-03-08 15:01 ` rguenth at gcc dot gnu.org
2024-01-24 16:07 ` law at gcc dot gnu.org
8 siblings, 0 replies; 9+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2022-03-08 14:58 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84201
--- Comment #18 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Richard Biener <rguenth@gcc.gnu.org>:
https://gcc.gnu.org/g:058d19b42ad4c4c22635f70db6913a80884aedec
commit r12-7535-g058d19b42ad4c4c22635f70db6913a80884aedec
Author: Richard Biener <rguenther@suse.de>
Date: Tue Mar 8 12:07:07 2022 +0100
tree-optimization/84201 - add --param vect-induction-float
This adds a --param to allow disabling of vectorization of
floating point inductions. Ontop of -Ofast this should allow
549.fotonik3d_r to not miscompare.
2022-03-08 Richard Biener <rguenther@suse.de>
PR tree-optimization/84201
* params.opt (-param=vect-induction-float): Add.
* doc/invoke.texi (vect-induction-float): Document.
* tree-vect-loop.cc (vectorizable_induction): Honor
param_vect_induction_float.
* gcc.dg/vect/pr84201.c: New testcase.
^ permalink raw reply [flat|nested] 9+ messages in thread
* [Bug target/84201] 549.fotonik3d_r from SPEC2017 fails verification with recent Intel and AMD CPUs
[not found] <bug-84201-4@http.gcc.gnu.org/bugzilla/>
` (6 preceding siblings ...)
2022-03-08 14:58 ` cvs-commit at gcc dot gnu.org
@ 2022-03-08 15:01 ` rguenth at gcc dot gnu.org
2024-01-24 16:07 ` law at gcc dot gnu.org
8 siblings, 0 replies; 9+ messages in thread
From: rguenth at gcc dot gnu.org @ 2022-03-08 15:01 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84201
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Resolution|--- |INVALID
Status|NEW |RESOLVED
--- Comment #19 from Richard Biener <rguenth at gcc dot gnu.org> ---
There's now --param vect-induction-float=0 available as mitigation which
disables all vectorization of floating point typed inductions.
There's nothing wrong with GCC as for the original report. Though we attempt
to make -Ofast usable for SPEC this particular vectorization isn't off the
hooks and the rounding errors are well within what you'd expect.
So, INVALID for the bug.
^ permalink raw reply [flat|nested] 9+ messages in thread
* [Bug target/84201] 549.fotonik3d_r from SPEC2017 fails verification with recent Intel and AMD CPUs
[not found] <bug-84201-4@http.gcc.gnu.org/bugzilla/>
` (7 preceding siblings ...)
2022-03-08 15:01 ` rguenth at gcc dot gnu.org
@ 2024-01-24 16:07 ` law at gcc dot gnu.org
8 siblings, 0 replies; 9+ messages in thread
From: law at gcc dot gnu.org @ 2024-01-24 16:07 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84201
Jeffrey A. Law <law at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |vineetg at gcc dot gnu.org
--- Comment #20 from Jeffrey A. Law <law at gcc dot gnu.org> ---
*** Bug 113570 has been marked as a duplicate of this bug. ***
^ permalink raw reply [flat|nested] 9+ messages in thread