public inbox for gcc-bugs@sourceware.org
* [Bug gcov-profile/90364] 521.wrf_r is 8-17% slower with PGO at -Ofast and native march/mtune
       [not found] <bug-90364-4@http.gcc.gnu.org/bugzilla/>
@ 2020-03-30 18:21 ` jamborm at gcc dot gnu.org
  2021-09-08 15:54 ` marxin at gcc dot gnu.org
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 6+ messages in thread
From: jamborm at gcc dot gnu.org @ 2020-03-30 18:21 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90364

Martin Jambor <jamborm at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|2019-05-06 00:00:00         |2020-3-30
            Summary|521.wrf_r is 9.5 % slower   |521.wrf_r is 8-17% slower
                   |with PGO on Zen CPUs at     |with PGO at -Ofast and
                   |-Ofast and native           |native march/mtune
                   |march/mtune                 |

--- Comment #9 from Martin Jambor <jamborm at gcc dot gnu.org> ---
The problem still persists across the board, causing:

- 17% regression against non-PGO on AMD Zen2 CPU,
-  8% regression against non-PGO on AMD Zen1 CPU, and
- 12% regression against non-PGO on Intel Cascade Lake server CPU.

All of the above is at -Ofast -march=native, by the way. At just -O2
(and generic -march) PGO actually helps by 25-27% on all three
systems, so I would double-check before blaming specinvoke (though of
course it might be the culprit).


* [Bug gcov-profile/90364] 521.wrf_r is 8-17% slower with PGO at -Ofast and native march/mtune
       [not found] <bug-90364-4@http.gcc.gnu.org/bugzilla/>
  2020-03-30 18:21 ` [Bug gcov-profile/90364] 521.wrf_r is 8-17% slower with PGO at -Ofast and native march/mtune jamborm at gcc dot gnu.org
@ 2021-09-08 15:54 ` marxin at gcc dot gnu.org
  2021-09-09  6:04 ` rguenth at gcc dot gnu.org
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 6+ messages in thread
From: marxin at gcc dot gnu.org @ 2021-09-08 15:54 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90364

Martin Liška <marxin at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |marxin at gcc dot gnu.org

--- Comment #10 from Martin Liška <marxin at gcc dot gnu.org> ---
All right, I understand what goes wrong. The benchmark builds 2 binaries: wrf_r
and diffwrf_521. Both of them contain pretty much the same objects that *are*
built twice:

gfortran -c -o module_mp_wsm5.fppized.o -I. -I./netcdf/include -I./inc -O2
-march=native -std=legacy -fprofile-generate -fconvert=big-endian -fno-openmp
-g0 module_mp_wsm5.fppized.f90

then wrf_r is trained, module_mp_wsm5.fppized.gcda is properly created.
But then diffwrf_521 is invoked and the GCDA is overwritten:

$ export GCOV_ERROR_FILE=/tmp/wrf.txt
...
$ grep wsm5 /tmp/wrf.txt
libgcov profiling
error:/home/marxin/Programming/cpu2017/benchspec/CPU/521.wrf_r/build/build_peak_gcc-m64.0000/module_mp_wsm5.fppized.gcda:overwriting
an existing profile data with a different timestamp

That explains why we end up with a profile that has a relatively low
sum_max=4450478: as shown, the profile comes from the verification binary
diffwrf_521.

I don't have an easy solution for that. Maybe we can somehow drop
-fprofile-generate for the diffwrf_521 binary. Is that possible?
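
The failure mode described in this comment can be modelled in a few lines of C.
This is a simplified sketch, not libgcov's actual code: `struct profile`, its
fields, `N_COUNTERS` and `store_run_old` are hypothetical names chosen for
illustration. It assumes that a stamp mismatch between the profile on disk and
the running binary causes the existing counters to be replaced rather than
merged, which is what the "overwriting an existing profile data with a
different timestamp" message reports.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define N_COUNTERS 4

/* Hypothetical, simplified stand-in for the contents of one .gcda file. */
struct profile {
    int      valid;                 /* 0 = nothing written to disk yet */
    uint32_t stamp;                 /* timestamp of the object that wrote it */
    uint64_t counters[N_COUNTERS];  /* execution counts */
};

/* Models the behaviour seen in the log (assumed, simplified):
   same stamp      -> counters of this run are added to the stored ones,
   different stamp -> stored counters are discarded first (overwrite).
   Returns 1 if existing data was overwritten, 0 otherwise. */
int store_run_old(struct profile *disk, uint32_t stamp,
                  const uint64_t run[N_COUNTERS])
{
    assert(disk != NULL && run != NULL);

    int overwritten = disk->valid && disk->stamp != stamp;

    if (!disk->valid || overwritten) {
        for (size_t i = 0; i < N_COUNTERS; i++)
            disk->counters[i] = 0;
    }
    for (size_t i = 0; i < N_COUNTERS; i++)
        disk->counters[i] += run[i];

    disk->stamp = stamp;
    disk->valid = 1;
    return overwritten;
}
```

Calling `store_run_old` first with large counters (standing in for the wrf_r
training run) and then with small counters under a different stamp (standing
in for diffwrf_521) leaves only the small counters in `disk`, which mirrors the
low sum_max observed above.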


* [Bug gcov-profile/90364] 521.wrf_r is 8-17% slower with PGO at -Ofast and native march/mtune
       [not found] <bug-90364-4@http.gcc.gnu.org/bugzilla/>
  2020-03-30 18:21 ` [Bug gcov-profile/90364] 521.wrf_r is 8-17% slower with PGO at -Ofast and native march/mtune jamborm at gcc dot gnu.org
  2021-09-08 15:54 ` marxin at gcc dot gnu.org
@ 2021-09-09  6:04 ` rguenth at gcc dot gnu.org
  2021-09-10  7:45 ` marxin at gcc dot gnu.org
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 6+ messages in thread
From: rguenth at gcc dot gnu.org @ 2021-09-09  6:04 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90364

--- Comment #11 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Martin Liška from comment #10)
> All right, I understand what goes wrong. The benchmark builds 2 binaries:
> wrf_r and diffwrf_521. Both of them contain pretty much the same objects
> that *are* built twice:
> 
> gfortran -c -o module_mp_wsm5.fppized.o -I. -I./netcdf/include -I./inc -O2
> -march=native -std=legacy -fprofile-generate -fconvert=big-endian
> -fno-openmp -g0 module_mp_wsm5.fppized.f90
> 
> then wrf_r is trained, module_mp_wsm5.fppized.gcda is properly created.
> But then diffwrf_521 is invoked and the GCDA is overwritten:
> 
> $ export GCOV_ERROR_FILE=/tmp/wrf.txt
> ...
> $ grep wsm5 /tmp/wrf.txt
> libgcov profiling
> error:/home/marxin/Programming/cpu2017/benchspec/CPU/521.wrf_r/build/
> build_peak_gcc-m64.0000/module_mp_wsm5.fppized.gcda:overwriting an existing
> profile data with a different timestamp
> 
> That explains why we end up with a profile that has a relatively low
> sum_max=4450478: as shown, the profile comes from the verification binary
> diffwrf_521.
> 
> I don't have an easy solution for that. Maybe we can somehow drop
> -fprofile-generate for the diffwrf_521 binary. Is that possible?

Why don't we simply merge the profiles for wrf and diffwrf for shared sources?
How about adding GCOV_NO_OVERWRITE and setting that for WRF so the overwrite
does not happen?  How about giving the aux files a prefix (from the
executable?), that is, one taken from the data gcov uses to decide that it
cannot merge the profiles?


* [Bug gcov-profile/90364] 521.wrf_r is 8-17% slower with PGO at -Ofast and native march/mtune
       [not found] <bug-90364-4@http.gcc.gnu.org/bugzilla/>
                   ` (2 preceding siblings ...)
  2021-09-09  6:04 ` rguenth at gcc dot gnu.org
@ 2021-09-10  7:45 ` marxin at gcc dot gnu.org
  2021-10-13 13:27 ` cvs-commit at gcc dot gnu.org
  2021-10-13 13:48 ` marxin at gcc dot gnu.org
  5 siblings, 0 replies; 6+ messages in thread
From: marxin at gcc dot gnu.org @ 2021-09-10  7:45 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90364

Martin Liška <marxin at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|unassigned at gcc dot       |marxin at gcc dot gnu.org
                   |gnu.org                     |
   Target Milestone|---                         |12.0


* [Bug gcov-profile/90364] 521.wrf_r is 8-17% slower with PGO at -Ofast and native march/mtune
       [not found] <bug-90364-4@http.gcc.gnu.org/bugzilla/>
                   ` (3 preceding siblings ...)
  2021-09-10  7:45 ` marxin at gcc dot gnu.org
@ 2021-10-13 13:27 ` cvs-commit at gcc dot gnu.org
  2021-10-13 13:48 ` marxin at gcc dot gnu.org
  5 siblings, 0 replies; 6+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2021-10-13 13:27 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90364

--- Comment #12 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Martin Liska <marxin@gcc.gnu.org>:

https://gcc.gnu.org/g:72e0c742bd01f8e7e6dcca64042b9ad7e75979de

commit r12-4372-g72e0c742bd01f8e7e6dcca64042b9ad7e75979de
Author: Martin Liska <mliska@suse.cz>
Date:   Thu Sep 9 13:02:24 2021 +0200

    gcov: make profile merging smarter

    Support merging of profiles that are built from different .o files
    but belong to the same source file. Moreover, a checksum is verified
    during profile merging and so we can safely combine such profiles.

            PR gcov-profile/90364

    gcc/ChangeLog:

            * coverage.c (build_info): Emit checksum to the global variable.
            (build_info_type): Add new field for checksum.
            (coverage_obj_finish): Pass object_checksum.
            (coverage_init): Use 0 as checksum for .gcno files.
            * gcov-dump.c (dump_gcov_file): Dump also new checksum field.
            * gcov.c (read_graph_file): Read also checksum.
            * doc/invoke.texi: Document the behaviour change.

    libgcc/ChangeLog:

            * libgcov-driver.c (merge_one_data): Skip timestamp and verify
            checksums.
            (write_one_data): Write also checksum.
            * libgcov-util.c (read_gcda_file): Read also checksum field.
            * libgcov.h (struct gcov_info): Add new field.
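
The merge rule summarised in the commit message ("Skip timestamp and verify
checksums") can be sketched as follows. This is an illustrative model only,
not the code from libgcov-driver.c: `struct profile`, `enum merge_result`,
`N_COUNTERS` and `store_run_new` are hypothetical names. The sketch assumes
that two objects built from the same source carry the same checksum, and it
simply refuses to merge on a mismatch; how libgcov reports or handles a
mismatch is not stated in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define N_COUNTERS 4

/* Hypothetical, simplified stand-in for the contents of one .gcda file. */
struct profile {
    int      valid;                 /* 0 = nothing written to disk yet */
    uint32_t stamp;                 /* kept, but not compared when merging */
    uint32_t checksum;              /* assumed equal for objects of one source */
    uint64_t counters[N_COUNTERS];  /* execution counts */
};

enum merge_result { MERGE_OK, MERGE_CHECKSUM_MISMATCH };

/* Models the post-fix rule: ignore the timestamp, merge only when the
   checksum matches. On a mismatch the stored profile is left untouched
   (an assumption of this sketch). */
enum merge_result store_run_new(struct profile *disk, uint32_t stamp,
                                uint32_t checksum,
                                const uint64_t run[N_COUNTERS])
{
    assert(disk != NULL && run != NULL);

    if (!disk->valid) {
        /* First writer: start from zeroed counters. */
        for (size_t i = 0; i < N_COUNTERS; i++)
            disk->counters[i] = 0;
        disk->checksum = checksum;
        disk->valid = 1;
    } else if (disk->checksum != checksum) {
        return MERGE_CHECKSUM_MISMATCH;
    }

    /* Accumulate this run into the stored counters. */
    for (size_t i = 0; i < N_COUNTERS; i++)
        disk->counters[i] += run[i];

    disk->stamp = stamp;
    return MERGE_OK;
}
```

With this rule, a run standing in for diffwrf_521 (different stamp, same
checksum) adds its small counts to the ones stored by the wrf_r run instead of
replacing them.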


* [Bug gcov-profile/90364] 521.wrf_r is 8-17% slower with PGO at -Ofast and native march/mtune
       [not found] <bug-90364-4@http.gcc.gnu.org/bugzilla/>
                   ` (4 preceding siblings ...)
  2021-10-13 13:27 ` cvs-commit at gcc dot gnu.org
@ 2021-10-13 13:48 ` marxin at gcc dot gnu.org
  5 siblings, 0 replies; 6+ messages in thread
From: marxin at gcc dot gnu.org @ 2021-10-13 13:48 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90364

Martin Liška <marxin at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #13 from Martin Liška <marxin at gcc dot gnu.org> ---
Should be fixed now.
