From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-bugs-return-292620-listarch-gcc-bugs=gcc.gnu.org@gcc.gnu.org>
Received: (qmail 4537 invoked by alias); 31 Aug 2009 23:59:56 -0000
Received: (qmail 4366 invoked by uid 48); 31 Aug 2009 23:59:44 -0000
Date: Mon, 31 Aug 2009 23:59:00 -0000
Message-ID: <20090831235944.4365.qmail@sourceware.org>
X-Bugzilla-Reason: CC
References: <bug-40106-12313@http.gcc.gnu.org/bugzilla/>
Subject: [Bug middle-end/40106] Time increase for the Polyhedron test air.f90 due to bad optimization
In-Reply-To: <bug-40106-12313@http.gcc.gnu.org/bugzilla/>
Reply-To: gcc-bugzilla@gcc.gnu.org
To: gcc-bugs@gcc.gnu.org
From: "dominiq at lps dot ens dot fr" <gcc-bugzilla@gcc.gnu.org>
Mailing-List: contact gcc-bugs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id: <gcc-bugs.gcc.gnu.org>
List-Archive: <http://gcc.gnu.org/ml/gcc-bugs/>
List-Post: <mailto:gcc-bugs@gcc.gnu.org>
List-Help: <mailto:gcc-bugs-help@gcc.gnu.org>
Sender: gcc-bugs-owner@gcc.gnu.org
X-SW-Source: 2009-08/txt/msg02422.txt.bz2


------- Comment #28 from dominiq at lps dot ens dot fr  2009-08-31 23:59 -------
Following Richard Guenther's suggestion on IRC, I have tested the following
patch:

--- ../_gcc_clean/gcc/builtins.c        2009-08-31 15:07:18.000000000 +0200
+++ gcc/builtins.c      2009-09-01 01:28:09.000000000 +0200
@@ -3012,7 +3012,7 @@
       real_from_integer (&cint, VOIDmode, n, n < 0 ? -1 : 0, 0);
       if (real_identical (&c2, &cint)
          && ((flag_unsafe_math_optimizations
-              && optimize_insn_for_speed_p ()
+              /* && optimize_insn_for_speed_p () */
               && powi_cost (n/2) <= POWI_MAX_MULTS)
              || n == 1))
        {

With it I get:

[ibook-dhum] test/dbg_air% gfc -m64 -O2 -funsafe-math-optimizations air_db.f90
[ibook-dhum] test/dbg_air% time a.out > /dev/null
4.490u 0.018s 0:04.51 99.7%     0+0k 0+3io 0pf+0w

compared to

[ibook-dhum] test/dbg_air% gfc -m64 -O2 -funsafe-math-optimizations -fno-strict-overflow air_db.f90
[ibook-dhum] test/dbg_air% time a.out > /dev/null
4.320u 0.015s 0:04.34 99.7%     0+0k 0+0io 0pf+0w

and there is no call to pow in the assembly. I think the difference (about 4%
in user time) is significant, so it seems that optimize_insn_for_speed_p () is
playing some role elsewhere in the code. Note that if I replace lines 322 and 427

            mu = mu0*(T(i,j)/t02)**1.5*(t02+110.56)/(T(i,j)+110.56)

with

            mu = mu0*sqrt((T(i,j)/t02)**3)*(t02+110.56)/(T(i,j)+110.56)

or

            mu = mu0*sqrt((T(i,j)/t02))*(T(i,j)/t02)*(t02+110.56)/(T(i,j)+110.56)

there is no call to pow, and the code is slightly faster with
-fno-strict-overflow:

[ibook-dhum] test/dbg_air% gfc -m64 -O2 -funsafe-math-optimizations -fno-strict-overflow air_db_1.f90
[ibook-dhum] test/dbg_air% time a.out > /dev/null
4.323u 0.015s 0:04.34 99.7%     0+0k 0+0io 0pf+0w
[ibook-dhum] test/dbg_air% gfc -m64 -O2 -funsafe-math-optimizations air_db_1.f90
[ibook-dhum] test/dbg_air% time a.out > /dev/null
4.527u 0.016s 0:04.55 99.5%     0+0k 0+0io 0pf+0w

The original air.f90 compiled with -fwhole-file gives

[ibook-dhum] lin/test% gfc -m64 -O3 -ffast-math -funroll-loops -ftree-loop-linear -fomit-frame-pointer -finline-limit=600 --param min-vect-loop-bound=2 -fwhole-file air.f90
[ibook-dhum] lin/test% time a.out > /dev/null
8.358u 0.049s 0:08.42 99.6%     0+0k 0+8io 0pf+0w

compared to

[ibook-dhum] lin/test% gfc -m64 -O3 -ffast-math -funroll-loops -ftree-loop-linear -fomit-frame-pointer -finline-limit=600 --param min-vect-loop-bound=2 air.f90
[ibook-dhum] lin/test% time a.out > /dev/null
8.273u 0.046s 0:08.32 99.8%     0+0k 0+0io 0pf+0w


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40106