public inbox for gcc-bugs@sourceware.org
* [Bug gcov-profile/90364] 521.wrf_r is 8-17% slower with PGO at -Ofast and native march/mtune
From: jamborm at gcc dot gnu.org @ 2020-03-30 18:21 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90364
Martin Jambor <jamborm at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|2019-05-06 00:00:00         |2020-3-30
            Summary|521.wrf_r is 9.5 % slower   |521.wrf_r is 8-17% slower
                   |with PGO on Zen CPUs at     |with PGO at -Ofast and
                   |-Ofast and native           |native march/mtune
                   |march/mtune                 |
--- Comment #9 from Martin Jambor <jamborm at gcc dot gnu.org> ---
The problem still persists across the board, causing:
- 17% regression against non-PGO on AMD Zen2 CPU,
- 8% regression against non-PGO on AMD Zen1 CPU, and
- 12% regression against non-PGO on Intel Cascade Lake server CPU.
All of the above is at -Ofast -march=native, by the way. At just -O2
(and generic -march), PGO actually helps by 25-27% on all three
systems, so I would double-check before blaming specinvoke (though of
course it might be the culprit).
* [Bug gcov-profile/90364] 521.wrf_r is 8-17% slower with PGO at -Ofast and native march/mtune
From: marxin at gcc dot gnu.org @ 2021-09-08 15:54 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90364
Martin Liška <marxin at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |marxin at gcc dot gnu.org
--- Comment #10 from Martin Liška <marxin at gcc dot gnu.org> ---
All right, I understand what goes wrong. The benchmark builds 2 binaries: wrf_r
and diffwrf_521. Both of them contain pretty much the same objects that *are*
built twice:
gfortran -c -o module_mp_wsm5.fppized.o -I. -I./netcdf/include -I./inc -O2
-march=native -std=legacy -fprofile-generate -fconvert=big-endian -fno-openmp
-g0 module_mp_wsm5.fppized.f90
Then wrf_r is trained and module_mp_wsm5.fppized.gcda is properly created.
But then diffwrf_521 is invoked and the GCDA is overwritten:
$ export GCOV_ERROR_FILE=/tmp/wrf.txt
...
$ grep wsm5 /tmp/wrf.txt
libgcov profiling error:/home/marxin/Programming/cpu2017/benchspec/CPU/521.wrf_r/build/build_peak_gcc-m64.0000/module_mp_wsm5.fppized.gcda:overwriting an existing profile data with a different timestamp
That explains why we end up with a profile that has a relatively low
sum_max=4450478: as shown, the profile comes from the verification binary
diffwrf_521.
I don't have an easy solution for that. Maybe we can somehow drop
-fprofile-generate for the diffwrf_521 binary. Is it possible?
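
To illustrate the failure mode described above, here is a toy Python model
of it. This is only a sketch, not libgcov code: the Gcda class, the
dump_profile_old function, the stamps and the counter values are all made
up for illustration. It assumes that an on-disk profile is merged only when
the stamp recorded at build time matches, and is replaced otherwise.

```python
# Toy model of the pre-fix behaviour; not actual libgcov code.
from dataclasses import dataclass


@dataclass
class Gcda:
    stamp: int      # build stamp of the object that produced the data
    counters: list  # execution counters (hypothetical values below)


def dump_profile_old(store, path, stamp, counters, errors):
    """Write counters to store[path]; merge only if the stamps agree."""
    existing = store.get(path)
    if existing is not None and existing.stamp == stamp:
        existing.counters = [a + b for a, b in zip(existing.counters, counters)]
        return
    if existing is not None:
        # Same object name, different build: the old data is lost.
        errors.append(
            f"{path}:overwriting an existing profile data with a different timestamp"
        )
    store[path] = Gcda(stamp, list(counters))


store, errors = {}, []
path = "module_mp_wsm5.fppized.gcda"
# Training run of wrf_r: large counters.
dump_profile_old(store, path, stamp=1001, counters=[900_000, 450_000], errors=errors)
# diffwrf_521 was built from a second compilation of the same source.
dump_profile_old(store, path, stamp=1002, counters=[12, 3], errors=errors)

print(store[path].counters)  # only the diffwrf_521 counters survive
print(errors)
```

In this model the second dump leaves only the small diffwrf_521 counters
behind, which is consistent with the low sum_max seen in the real profile.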
* [Bug gcov-profile/90364] 521.wrf_r is 8-17% slower with PGO at -Ofast and native march/mtune
From: rguenth at gcc dot gnu.org @ 2021-09-09 6:04 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90364
--- Comment #11 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Martin Liška from comment #10)
> All right, I understand what goes wrong. The benchmark builds 2 binaries:
> wrf_r and diffwrf_521. Both of them contain pretty much the same objects
> that *are* built twice:
>
> gfortran -c -o module_mp_wsm5.fppized.o -I. -I./netcdf/include -I./inc -O2
> -march=native -std=legacy -fprofile-generate -fconvert=big-endian
> -fno-openmp -g0 module_mp_wsm5.fppized.f90
>
> then wrf_r is trained, module_mp_wsm5.fppized.gcda is properly created.
> But then diffwrf_521 is invoked and the GCDA if overwritten:
>
> $ export GCOV_ERROR_FILE=/tmp/wrf.txt
> ...
> $ grep wsm5 /tmp/wrf.txt
> libgcov profiling
> error:/home/marxin/Programming/cpu2017/benchspec/CPU/521.wrf_r/build/
> build_peak_gcc-m64.0000/module_mp_wsm5.fppized.gcda:overwriting an existing
> profile data with a different timestamp
>
> That explains why we end up with a profile that has relatively low
> sum_max=4450478, as shown the profile comes from a verification binary
> diffwrf_521.
>
> I don't have an easy solution for that. Maybe we can somehow drop
> -fprofile-generate for diffwrf_521 binary. Is it possible?
Why don't we simply merge the profiles for wrf and diffwrf for shared sources?
How about adding GCOV_NO_OVERWRITE and setting that for WRF so the overwrite
does not happen? How about making the aux files have a prefix (from the
executable?), that is, from the data on which gcov decides that it cannot
merge the profiles?
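
The per-executable prefix idea could look roughly like the sketch below.
The naming scheme and the prefixed_gcda_path helper are hypothetical; this
is not an existing gcov option, and the consumer side (how the compiler
would later pick or combine the prefixed files) is left out.

```python
# Hypothetical naming scheme: one .gcda per (executable, object) pair.
from pathlib import PurePosixPath


def prefixed_gcda_path(executable, object_path):
    """Return a .gcda path next to the object, prefixed by the executable name."""
    obj = PurePosixPath(object_path)
    exe = PurePosixPath(executable).name
    return str(obj.with_name(f"{exe}-{obj.stem}.gcda"))


obj = "build/module_mp_wsm5.fppized.o"
print(prefixed_gcda_path("./wrf_r", obj))
print(prefixed_gcda_path("./diffwrf_521", obj))
```

With distinct file names the two binaries would no longer write to the same
data file, at the cost of having several profiles per source file.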
* [Bug gcov-profile/90364] 521.wrf_r is 8-17% slower with PGO at -Ofast and native march/mtune
From: marxin at gcc dot gnu.org @ 2021-09-10 7:45 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90364
Martin Liška <marxin at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Assignee|unassigned at gcc dot gnu.org |marxin at gcc dot gnu.org
Target Milestone|--- |12.0
* [Bug gcov-profile/90364] 521.wrf_r is 8-17% slower with PGO at -Ofast and native march/mtune
From: cvs-commit at gcc dot gnu.org @ 2021-10-13 13:27 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90364
--- Comment #12 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Martin Liska <marxin@gcc.gnu.org>:
https://gcc.gnu.org/g:72e0c742bd01f8e7e6dcca64042b9ad7e75979de
commit r12-4372-g72e0c742bd01f8e7e6dcca64042b9ad7e75979de
Author: Martin Liska <mliska@suse.cz>
Date: Thu Sep 9 13:02:24 2021 +0200
gcov: make profile merging smarter
Support merging of profiles that are built from different .o files
but belong to the same source file. Moreover, a checksum is verified
during profile merging and so we can safely combine such profiles.
PR gcov-profile/90364
gcc/ChangeLog:
* coverage.c (build_info): Emit checksum to the global variable.
(build_info_type): Add new field for checksum.
(coverage_obj_finish): Pass object_checksum.
(coverage_init): Use 0 as checksum for .gcno files.
* gcov-dump.c (dump_gcov_file): Dump also new checksum field.
* gcov.c (read_graph_file): Read also checksum.
* doc/invoke.texi: Document the behaviour change.
libgcc/ChangeLog:
* libgcov-driver.c (merge_one_data): Skip timestamp and verify
checksums.
(write_one_data): Write also checksum.
* libgcov-util.c (read_gcda_file): Read also checksum field.
* libgcov.h (struct gcov_info): Add new field.
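
As a rough illustration of what the commit message describes (skip the
timestamp, verify a checksum, then merge), here is a toy Python model. It
is a sketch and not the libgcov implementation: merge_profile,
ChecksumMismatch, the checksum value and the counters are invented for the
example, and rejecting a mismatching checksum with an exception is an
assumption of this sketch.

```python
# Toy model of checksum-verified merging; not the actual libgcov code.

class ChecksumMismatch(Exception):
    """Raised when the on-disk profile has a different checksum."""


def merge_profile(store, path, checksum, counters):
    """Merge counters into store[path], ignoring build timestamps entirely."""
    existing = store.get(path)
    if existing is None:
        store[path] = {"checksum": checksum, "counters": list(counters)}
        return
    if existing["checksum"] != checksum:
        # Different checksum: combining the counters would not be safe.
        raise ChecksumMismatch(path)
    existing["counters"] = [a + b for a, b in zip(existing["counters"], counters)]


store = {}
path = "module_mp_wsm5.fppized.gcda"
# Two builds of the same source share a checksum (hypothetical value).
merge_profile(store, path, checksum=0xC0FFEE, counters=[900_000, 450_000])  # wrf_r
merge_profile(store, path, checksum=0xC0FFEE, counters=[12, 3])             # diffwrf_521
print(store[path]["counters"])  # [900012, 450003]
```

In this model the training data from wrf_r is kept and the diffwrf_521
counters are simply added on top, rather than replacing it.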
* [Bug gcov-profile/90364] 521.wrf_r is 8-17% slower with PGO at -Ofast and native march/mtune
From: marxin at gcc dot gnu.org @ 2021-10-13 13:48 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90364
Martin Liška <marxin at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |RESOLVED
Resolution|--- |FIXED
--- Comment #13 from Martin Liška <marxin at gcc dot gnu.org> ---
Should be fixed now.