public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/59487] New: [4.9 Regression] When compiled with -fwhole-program rnflow.f90 runs up to 40% slower after r202826
@ 2013-12-12 18:19 dominiq at lps dot ens.fr
  2013-12-13  9:28 ` [Bug tree-optimization/59487] " rguenth at gcc dot gnu.org
                   ` (10 more replies)
  0 siblings, 11 replies; 12+ messages in thread
From: dominiq at lps dot ens.fr @ 2013-12-12 18:19 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59487

            Bug ID: 59487
           Summary: [4.9 Regression] When compiled with -fwhole-program
                    rnflow.f90 runs up to 40% slower after r202826
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: dominiq at lps dot ens.fr
                CC: burnus at gcc dot gnu.org, Ganesh.Gopalasubramanian at amd dot com,
                    rguenth at gcc dot gnu.org

When compiled with -fwhole-program, rnflow.f90 runs up to 40% slower after
r202826 (Core i7 at 2.8 GHz):

[Book15] lin/test% /opt/gcc/gcc4.9p-202828/bin/gfortran -Ofast -fwhole-program
rnflow.f90
[Book15] lin/test% time a.out > /dev/null
18.244u 0.020s 0:18.26 100.0%    0+0k 3+1io 0pf+0w
[Book15] lin/test% /opt/gcc/gcc4.9p-202825/bin/gfortran -Ofast -fwhole-program
rnflow.f90
[Book15] lin/test% time a.out > /dev/null
13.022u 0.024s 0:13.04 100.0%    0+0k 4+1io 0pf+0w
[Book15] lin/test% gfcc -Ofast -fwhole-program rnflow.f90
[Book15] lin/test% time a.out > /dev/null
18.533u 0.036s 0:18.58 99.8%    0+0k 0+0io 45pf+0w
[Book15] lin/test% gfcc -Ofast rnflow.f90
[Book15] lin/test% time a.out > /dev/null
13.059u 0.020s 0:13.08 99.9%    0+0k 0+0io 0pf+0w
[Book15] lin/test% gfc -Ofast -fwhole-program rnflow.f90
[Book15] lin/test% time a.out > /dev/null
12.940u 0.028s 0:12.97 99.9%    0+0k 0+0io 14pf+0w

gfcc is r205891, and gfc is r205924 with the patch for pr58721 at
http://gcc.gnu.org/ml/fortran/2013-12/msg00069.html applied, which also fixes
this slowdown (the slowdown is ~20% on a Core2Duo at 2.5 GHz). This was noticed
in the last comment of pr58464.
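
For reference, the "up to 40%" figure can be checked from the user times in the
transcript above. This is only a minimal sketch; the helper name `slowdown` is
made up for illustration and is not part of any GCC tooling.

```python
def slowdown(fast_seconds, slow_seconds):
    """Fractional increase in run time going from the fast build to the slow one."""
    return (slow_seconds - fast_seconds) / fast_seconds


# User times from the transcript above, both builds using -fwhole-program:
# r202825 took 13.022 s and r202828 took 18.244 s.
print(f"{slowdown(13.022, 18.244):.1%}")  # prints 40.1%
```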



* [Bug tree-optimization/59487] [4.9 Regression] When compiled with -fwhole-program rnflow.f90 runs up to 40% slower after r202826
  2013-12-12 18:19 [Bug tree-optimization/59487] New: [4.9 Regression] When compiled with -fwhole-program rnflow.f90 runs up to 40% slower after r202826 dominiq at lps dot ens.fr
@ 2013-12-13  9:28 ` rguenth at gcc dot gnu.org
  2013-12-13  9:51 ` burnus at gcc dot gnu.org
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: rguenth at gcc dot gnu.org @ 2013-12-13  9:28 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59487

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Version|unknown                     |4.9.0
   Target Milestone|---                         |4.9.0

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
r202826 was fixed later by the fix for PR58656 (rnflow regression), so your
bisection converged on a bogus revision.



* [Bug tree-optimization/59487] [4.9 Regression] When compiled with -fwhole-program rnflow.f90 runs up to 40% slower after r202826
  2013-12-12 18:19 [Bug tree-optimization/59487] New: [4.9 Regression] When compiled with -fwhole-program rnflow.f90 runs up to 40% slower after r202826 dominiq at lps dot ens.fr
  2013-12-13  9:28 ` [Bug tree-optimization/59487] " rguenth at gcc dot gnu.org
@ 2013-12-13  9:51 ` burnus at gcc dot gnu.org
  2013-12-13 13:03 ` dominiq at lps dot ens.fr
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: burnus at gcc dot gnu.org @ 2013-12-13  9:51 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59487

Tobias Burnus <burnus at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Depends on|                            |58721

--- Comment #2 from Tobias Burnus <burnus at gcc dot gnu.org> ---
If the draft/RFC patch for PR58721 fixes the issue, then mark this PR as
depending on the other one.

Additionally, the cause is then presumably the same, namely the tuning of the
builtin-expect probability in commit r203167. The change itself is okay even if
it clashes with gfortran's internal use, but that can be fixed differently, as
discussed in the draft patch and in the other PR.



* [Bug tree-optimization/59487] [4.9 Regression] When compiled with -fwhole-program rnflow.f90 runs up to 40% slower after r202826
  2013-12-12 18:19 [Bug tree-optimization/59487] New: [4.9 Regression] When compiled with -fwhole-program rnflow.f90 runs up to 40% slower after r202826 dominiq at lps dot ens.fr
  2013-12-13  9:28 ` [Bug tree-optimization/59487] " rguenth at gcc dot gnu.org
  2013-12-13  9:51 ` burnus at gcc dot gnu.org
@ 2013-12-13 13:03 ` dominiq at lps dot ens.fr
  2013-12-13 14:01 ` rguenther at suse dot de
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: dominiq at lps dot ens.fr @ 2013-12-13 13:03 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59487

--- Comment #3 from Dominique d'Humieres <dominiq at lps dot ens.fr> ---
> r202826 was fixed later by the fix for PR58656 (rnflow regression), 
> so your bisection converged on a bogus revision.

OK! I was too focused on the -fwhole-program option. The slowdown after r202826
occurred both with and without this option. r203377 fixed the regression without
the option, but not with it:

[Book15] lin/test% /opt/gcc/gcc4.9p-202825/bin/gfortran -Ofast -fwhole-program
rnflow.f90
[Book15] lin/test% time a.out > /dev/null
12.843u 0.017s 0:12.86 99.9%    0+0k 0+0io 0pf+0w
[Book15] lin/test% /opt/gcc/gcc4.9p-202825/bin/gfortran -Ofast rnflow.f90
[Book15] lin/test% time a.out > /dev/null
12.889u 0.022s 0:12.92 99.8%    0+0k 0+0io 41pf+0w
[Book15] lin/test% /opt/gcc/gcc4.9p-202828/bin/gfortran -Ofast rnflow.f90
[Book15] lin/test% time a.out > /dev/null
17.891u 0.019s 0:17.91 99.9%    0+0k 0+0io 0pf+0w
[Book15] lin/test% /opt/gcc/gcc4.9p-202828/bin/gfortran -Ofast -fwhole-program
rnflow.f90
[Book15] lin/test% time a.out > /dev/null
17.985u 0.021s 0:18.01 99.9%    0+0k 4+0io 41pf+0w
[Book15] lin/test% /opt/gcc/gcc4.9p-203250/bin/gfortran -Ofast rnflow.f90
[Book15] lin/test% time a.out > /dev/null
17.974u 0.020s 0:17.99 100.0%    0+0k 0+0io 0pf+0w
[Book15] lin/test% /opt/gcc/gcc4.9p-203250/bin/gfortran -Ofast -fwhole-program
rnflow.f90
[Book15] lin/test% time a.out > /dev/null
18.182u 0.021s 0:18.21 99.9%    0+0k 0+1io 38pf+0w
[Book15] lin/test% /opt/gcc/gcc4.9p-203492/bin/gfortran -Ofast rnflow.f90
[Book15] lin/test% time a.out > /dev/null
12.856u 0.018s 0:12.87 99.9%    0+0k 0+0io 0pf+0w
[Book15] lin/test% /opt/gcc/gcc4.9p-203492/bin/gfortran -Ofast -fwhole-program
rnflow.f90
[Book15] lin/test% time a.out > /dev/null
18.253u 0.021s 0:18.28 99.9%    0+0k 0+0io 39pf+0w

AFAICT the usual incantations ('large-function-growth',
'max-inline-insns-auto', or 'builtin-expect-probability') have no visible
effect on this slowdown with -fwhole-program.
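
The timings above are easier to compare side by side. The following is a rough
sketch that pulls the user time out of each tcsh `time` line and classifies the
run. The timing strings are copied from the transcript; the names `RUNS`,
`user_seconds` and `is_slow`, and the 15-second cut-off, are assumptions made
for this illustration only.

```python
import re

# Timing lines copied from the transcript, keyed by (revision, -fwhole-program used).
RUNS = {
    (202825, True):  "12.843u 0.017s 0:12.86 99.9%    0+0k 0+0io 0pf+0w",
    (202825, False): "12.889u 0.022s 0:12.92 99.8%    0+0k 0+0io 41pf+0w",
    (202828, False): "17.891u 0.019s 0:17.91 99.9%    0+0k 0+0io 0pf+0w",
    (202828, True):  "17.985u 0.021s 0:18.01 99.9%    0+0k 4+0io 41pf+0w",
    (203250, False): "17.974u 0.020s 0:17.99 100.0%    0+0k 0+0io 0pf+0w",
    (203250, True):  "18.182u 0.021s 0:18.21 99.9%    0+0k 0+1io 38pf+0w",
    (203492, False): "12.856u 0.018s 0:12.87 99.9%    0+0k 0+0io 0pf+0w",
    (203492, True):  "18.253u 0.021s 0:18.28 99.9%    0+0k 0+0io 39pf+0w",
}

# The first field of a tcsh `time` line is the user CPU time, e.g. "12.843u".
USER_TIME = re.compile(r"^(\d+\.\d+)u\s")


def user_seconds(line):
    """Extract the user CPU time in seconds from one tcsh `time` output line."""
    match = USER_TIME.match(line)
    if match is None:
        raise ValueError(f"not a tcsh time line: {line!r}")
    return float(match.group(1))


def is_slow(revision, whole_program, threshold=15.0):
    """Classify a run as slow; the threshold is an arbitrary cut between ~13 s and ~18 s."""
    return user_seconds(RUNS[(revision, whole_program)]) > threshold


for rev in sorted({rev for rev, _ in RUNS}):
    plain = user_seconds(RUNS[(rev, False)])
    whole = user_seconds(RUNS[(rev, True)])
    print(f"r{rev}: plain {plain:6.3f}s  -fwhole-program {whole:6.3f}s")
```

With these numbers, r203492 is fast again without the option but still slow
with it, which is the asymmetry described in this comment.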



* [Bug tree-optimization/59487] [4.9 Regression] When compiled with -fwhole-program rnflow.f90 runs up to 40% slower after r202826
  2013-12-12 18:19 [Bug tree-optimization/59487] New: [4.9 Regression] When compiled with -fwhole-program rnflow.f90 runs up to 40% slower after r202826 dominiq at lps dot ens.fr
                   ` (2 preceding siblings ...)
  2013-12-13 13:03 ` dominiq at lps dot ens.fr
@ 2013-12-13 14:01 ` rguenther at suse dot de
  2013-12-13 14:14 ` dominiq at lps dot ens.fr
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: rguenther at suse dot de @ 2013-12-13 14:01 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59487

--- Comment #4 from rguenther at suse dot de <rguenther at suse dot de> ---
"dominiq at lps dot ens.fr" <gcc-bugzilla@gcc.gnu.org> wrote:
>http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59487
>
>--- Comment #3 from Dominique d'Humieres <dominiq at lps dot ens.fr>
>---
>> r202826 was fixed later by the fix for PR58656 (rnflow regression), 
>> so your bisection converged on a bogus revision.
>
>OK! I was too focused on the -fwhole-program option. The slowdown after
>r202826
>was with/without this option. r203377 fixed the regression without the
>option,
>but not with it:

Still the regression must appear with a different revision.  The one you cited
has nothing to do with -fwhole-program.

Richard.




* [Bug tree-optimization/59487] [4.9 Regression] When compiled with -fwhole-program rnflow.f90 runs up to 40% slower after r202826
  2013-12-12 18:19 [Bug tree-optimization/59487] New: [4.9 Regression] When compiled with -fwhole-program rnflow.f90 runs up to 40% slower after r202826 dominiq at lps dot ens.fr
                   ` (3 preceding siblings ...)
  2013-12-13 14:01 ` rguenther at suse dot de
@ 2013-12-13 14:14 ` dominiq at lps dot ens.fr
  2014-02-27  9:25 ` rguenth at gcc dot gnu.org
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: dominiq at lps dot ens.fr @ 2013-12-13 14:14 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59487

--- Comment #5 from Dominique d'Humieres <dominiq at lps dot ens.fr> ---
> Still the regression must appear with a different revision.  
> The one you cited has nothing to do with -fwhole-program.

I see the slowdown with -fwhole-program for all the revisions I have tested,
from r202826 up to r205891. If this has been fixed and broken again,
bisection won't give reliable results, and I am not planning to test the 3000+
revisions in that range.
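
To illustrate why bisection is unreliable in that situation, here is a small
sketch of the usual binary search over revisions. It assumes a single
good-to-bad transition in the range. The function `bisect_first_bad`, both
predicates, and especially the revision 204000 are hypothetical and chosen only
for the example; nothing in this thread says the regression reappeared there.

```python
def bisect_first_bad(good, bad, is_bad):
    """Binary search for a revision where is_bad flips from False to True.

    Requires is_bad(good) to be False and is_bad(bad) to be True. The result
    is the unique first bad revision only if there is a single transition.
    """
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(mid):
            bad = mid
        else:
            good = mid
    return bad


# One transition: everything from r202826 onwards is slow.
def single_transition(rev):
    return rev >= 202826


# Hypothetical "fixed and broken again" history: slow from r202826,
# fixed at r203377, slow again from the made-up revision 204000.
def fixed_then_broken(rev):
    return 202826 <= rev < 203377 or rev >= 204000


print(bisect_first_bad(202825, 205891, single_transition))
print(bisect_first_bad(202825, 205891, fixed_then_broken))
```

In the first case the search lands on r202826. In the second it converges on
the later, made-up transition and never looks at the earlier one, so the
reported revision depends on where the midpoints happen to fall. The search
itself needs only about a dozen builds for a range of roughly 3000 revisions;
the problem is the validity of the answer, not the number of steps.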



* [Bug tree-optimization/59487] [4.9 Regression] When compiled with -fwhole-program rnflow.f90 runs up to 40% slower after r202826
  2013-12-12 18:19 [Bug tree-optimization/59487] New: [4.9 Regression] When compiled with -fwhole-program rnflow.f90 runs up to 40% slower after r202826 dominiq at lps dot ens.fr
                   ` (4 preceding siblings ...)
  2013-12-13 14:14 ` dominiq at lps dot ens.fr
@ 2014-02-27  9:25 ` rguenth at gcc dot gnu.org
  2014-02-27 10:09 ` Ganesh.Gopalasubramanian at amd dot com
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: rguenth at gcc dot gnu.org @ 2014-02-27  9:25 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59487

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |WAITING
   Last reconfirmed|                            |2014-02-27
     Ever confirmed|0                           |1

--- Comment #6 from Richard Biener <rguenth at gcc dot gnu.org> ---
Somebody needs to do the analysis.  On our testers (all AMD) I don't see the
regression.



* [Bug tree-optimization/59487] [4.9 Regression] When compiled with -fwhole-program rnflow.f90 runs up to 40% slower after r202826
  2013-12-12 18:19 [Bug tree-optimization/59487] New: [4.9 Regression] When compiled with -fwhole-program rnflow.f90 runs up to 40% slower after r202826 dominiq at lps dot ens.fr
                   ` (5 preceding siblings ...)
  2014-02-27  9:25 ` rguenth at gcc dot gnu.org
@ 2014-02-27 10:09 ` Ganesh.Gopalasubramanian at amd dot com
  2014-02-27 10:56 ` rguenther at suse dot de
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Ganesh.Gopalasubramanian at amd dot com @ 2014-02-27 10:09 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59487

--- Comment #7 from GGanesh <Ganesh.Gopalasubramanian at amd dot com> ---
Richard! With gcc version 4.9.0 20140224, I could see a gap between
with/without -fwhole-program. 

with -fwhole-program : time ./rnflowWhPr
real    0m26.184s
user    0m26.018s
sys     0m0.156s

without -fwhole-program: time ./rnflow
real    0m18.251s
user    0m18.061s
sys     0m0.180s



* [Bug tree-optimization/59487] [4.9 Regression] When compiled with -fwhole-program rnflow.f90 runs up to 40% slower after r202826
  2013-12-12 18:19 [Bug tree-optimization/59487] New: [4.9 Regression] When compiled with -fwhole-program rnflow.f90 runs up to 40% slower after r202826 dominiq at lps dot ens.fr
                   ` (6 preceding siblings ...)
  2014-02-27 10:09 ` Ganesh.Gopalasubramanian at amd dot com
@ 2014-02-27 10:56 ` rguenther at suse dot de
  2014-02-27 15:35 ` Ganesh.Gopalasubramanian at amd dot com
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: rguenther at suse dot de @ 2014-02-27 10:56 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59487

--- Comment #8 from rguenther at suse dot de <rguenther at suse dot de> ---
On Thu, 27 Feb 2014, Ganesh.Gopalasubramanian at amd dot com wrote:

> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59487
> 
> --- Comment #7 from GGanesh <Ganesh.Gopalasubramanian at amd dot com> ---
> Richard! With gcc version 4.9.0 20140224, I could see a gap between
> with/without -fwhole-program. 

With what other options?




* [Bug tree-optimization/59487] [4.9 Regression] When compiled with -fwhole-program rnflow.f90 runs up to 40% slower after r202826
  2013-12-12 18:19 [Bug tree-optimization/59487] New: [4.9 Regression] When compiled with -fwhole-program rnflow.f90 runs up to 40% slower after r202826 dominiq at lps dot ens.fr
                   ` (7 preceding siblings ...)
  2014-02-27 10:56 ` rguenther at suse dot de
@ 2014-02-27 15:35 ` Ganesh.Gopalasubramanian at amd dot com
  2014-03-18 11:39 ` jakub at gcc dot gnu.org
  2014-03-18 14:56 ` rguenth at gcc dot gnu.org
  10 siblings, 0 replies; 12+ messages in thread
From: Ganesh.Gopalasubramanian at amd dot com @ 2014-02-27 15:35 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59487

--- Comment #9 from GGanesh <Ganesh.Gopalasubramanian at amd dot com> ---
The other options are -Ofast -funroll-loops -fpeel-loops.



* [Bug tree-optimization/59487] [4.9 Regression] When compiled with -fwhole-program rnflow.f90 runs up to 40% slower after r202826
  2013-12-12 18:19 [Bug tree-optimization/59487] New: [4.9 Regression] When compiled with -fwhole-program rnflow.f90 runs up to 40% slower after r202826 dominiq at lps dot ens.fr
                   ` (8 preceding siblings ...)
  2014-02-27 15:35 ` Ganesh.Gopalasubramanian at amd dot com
@ 2014-03-18 11:39 ` jakub at gcc dot gnu.org
  2014-03-18 14:56 ` rguenth at gcc dot gnu.org
  10 siblings, 0 replies; 12+ messages in thread
From: jakub at gcc dot gnu.org @ 2014-03-18 11:39 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59487
Bug 59487 depends on bug 58721, which changed state.

Bug 58721 Summary: [4.9 Regression] The subroutine perdida is no longer inlined in fatigue.f90
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58721

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED



* [Bug tree-optimization/59487] [4.9 Regression] When compiled with -fwhole-program rnflow.f90 runs up to 40% slower after r202826
  2013-12-12 18:19 [Bug tree-optimization/59487] New: [4.9 Regression] When compiled with -fwhole-program rnflow.f90 runs up to 40% slower after r202826 dominiq at lps dot ens.fr
                   ` (9 preceding siblings ...)
  2014-03-18 11:39 ` jakub at gcc dot gnu.org
@ 2014-03-18 14:56 ` rguenth at gcc dot gnu.org
  10 siblings, 0 replies; 12+ messages in thread
From: rguenth at gcc dot gnu.org @ 2014-03-18 14:56 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59487

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|WAITING                     |RESOLVED
         Resolution|---                         |FIXED

--- Comment #10 from Richard Biener <rguenth at gcc dot gnu.org> ---
Seems to be fixed by the fix for PR58721.


