public inbox for fortran@gcc.gnu.org
* Advice with finding speed between O2 and O3
@ 2023-05-22 15:31 Thompson, Matt (GSFC-610.1)[SCIENCE SYSTEMS AND APPLICATIONS INC]
  2023-05-22 21:36 ` Thomas Koenig
  0 siblings, 1 reply; 5+ messages in thread
From: Thompson, Matt (GSFC-610.1)[SCIENCE SYSTEMS AND APPLICATIONS INC] @ 2023-05-22 15:31 UTC (permalink / raw)
  To: fortran

All,

Recently, one of the computing centers I run on updated its OS, and with that update our model went from "working with GNU" to "crashing with GNU". No code change on our side, just the OS.

After some experimenting, I found that the code still ran with our debugging options, and it also ran with our "aggressive" options (many of which are due to Jerry DeLisle from here). Only our release flags failed. That was surprising, since the aggressive options seem more likely to cause issues: they are speed for speed's sake (different MPI layouts lead to different answers).

But one of the main differences is that the aggressive flags use -O2 and our release flags use -O3. So I tested our release flags with -O2 and boom, it works again! Bad news: it is much slower.

Our release flags are (essentially):

  -O3 -march=haswell -mtune=generic -funroll-loops -g -fPIC -fopenmp

so we aren't doing anything fancy (portability at the cost of speed).

Staring at the man page I saw this:

                   gcc -c -Q -O3 --help=optimizers > /tmp/O3-opts
                   gcc -c -Q -O2 --help=optimizers > /tmp/O2-opts
                   diff /tmp/O2-opts /tmp/O3-opts | grep enabled

and when I did that I saw:

$ diff /tmp/O2-opts /tmp/O3-opts | grep enabled
>   -fgcse-after-reload               [enabled]
>   -fipa-cp-clone                    [enabled]
>   -floop-interchange                [enabled]
>   -floop-unroll-and-jam             [enabled]
>   -fpeel-loops                      [enabled]
>   -fpredictive-commoning            [enabled]
>   -fsplit-loops                     [enabled]
>   -fsplit-paths                     [enabled]
>   -ftree-loop-distribution          [enabled]
>   -ftree-partial-pre                [enabled]
>   -funroll-completely-grow-size     [enabled]
>   -funswitch-loops                  [enabled]
>   -fversion-loops-for-strides       [enabled]
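
The flag names in that output can be turned mechanically into their -fno- counterparts, so that each -O3 extra can be switched off individually on top of the release flags. The following is only a sketch: the function names `o3_extras_as_fno` and `candidate_flag_sets` are made up for illustration, and `RELEASE_FLAGS` simply repeats the flag line quoted above.

```shell
#!/usr/bin/env bash
# Sketch: turn "[enabled]" lines from the O2/O3 diff into -fno-* flags.
# Function and variable names here are illustrative, not part of GCC.

# Release flags as quoted in the message (adjust as needed).
RELEASE_FLAGS="-O3 -march=haswell -mtune=generic -funroll-loops -g -fPIC -fopenmp"

# Read diff lines on stdin (e.g. ">   -fpeel-loops   [enabled]") and
# print one negated flag per line (e.g. "-fno-peel-loops").
o3_extras_as_fno() {
    awk '$1 == ">" && $2 ~ /^-f/ && $NF == "[enabled]" {
        sub(/^-f/, "-fno-", $2)
        print $2
    }'
}

# Read -fno-* flags on stdin and print one full flag set per line:
# the release flags with exactly one -O3 extra switched off.
candidate_flag_sets() {
    while read -r fno; do
        printf '%s %s\n' "$RELEASE_FLAGS" "$fno"
    done
}
```

Something like `diff /tmp/O2-opts /tmp/O3-opts | o3_extras_as_fno | candidate_flag_sets` would then list one build configuration per flag. One caveat: as far as I know, the GCC manual notes that not every optimization is controlled by its own flag, so -O3 with all of these negated is not guaranteed to behave exactly like -O2.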

Now, I'll be doing some experiments, but...that's a lot of tests and rebuilds. I was hoping maybe someone here can point me to "this flag is useful for Fortran" vs "this doesn't matter".
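
One way to cut down on the rebuilds, assuming that a single flag is responsible and that disabling it alone is enough to make the crash go away, is to bisect the list rather than test each flag in turn. With 13 flags that is about 4 rebuilds. The sketch below is hypothetical: `bisect_fno_flags` is an illustrative name, and the check command passed to it (a rebuild plus a test run) is something you would have to supply.

```shell
#!/usr/bin/env bash
# Sketch: bisect a list of -fno-* flags, assuming exactly one of them
# makes the crash go away. "$1" names a command that receives the flags
# to try and returns 0 when the resulting build runs correctly.
bisect_fno_flags() {
    local check=$1; shift
    local flags=("$@")
    local half first second
    while [ "${#flags[@]}" -gt 1 ]; do
        half=$(( ${#flags[@]} / 2 ))
        first=("${flags[@]:0:half}")
        second=("${flags[@]:half}")
        if "$check" "${first[@]}"; then
            flags=("${first[@]}")    # disabling the first half was enough
        else
            flags=("${second[@]}")   # so the culprit is in the second half
        fi
    done
    printf '%s\n' "${flags[0]}"
}
```

The check command would typically rebuild with the release flags plus the flags it is given and then run a short test case. Because of the single-culprit assumption, the flag that comes out is worth confirming with one more build that disables only that flag.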

And maybe which one might be triggered by an OS update? ¯\_(ツ)_/¯

Thanks,
Matt
--
Matt Thompson, SSAI, Ld Scientific Programmer/Analyst
NASA GSFC,    Global Modeling and Assimilation Office
Code 610.1,  8800 Greenbelt Rd,  Greenbelt,  MD 20771
Phone: 301-614-6712                 Fax: 301-614-6246
http://science.gsfc.nasa.gov/sed/bio/matthew.thompson


Thread overview: 5+ messages
2023-05-22 15:31 Advice with finding speed between O2 and O3 Thompson, Matt (GSFC-610.1)[SCIENCE SYSTEMS AND APPLICATIONS INC]
2023-05-22 21:36 ` Thomas Koenig
2023-05-25 16:05   ` [EXTERNAL] " Thompson, Matt (GSFC-610.1)[SCIENCE SYSTEMS AND APPLICATIONS INC]
2023-05-25 17:01     ` Steve Kargl
2023-05-25 18:51       ` Harald Anlauf
