public inbox for gcc-patches@gcc.gnu.org
From: Jack Howarth <howarth@bromo.med.uc.edu>
To: Mikael Morin <mikael.morin@sfr.fr>
Cc: fortran@gcc.gnu.org, GCC patches <gcc-patches@gcc.gnu.org>
Subject: Re: [Patch, fortran] [00/66] PR fortran/43829 Inline sum and product (AKA scalarization of reductions)
Date: Sat, 29 Oct 2011 16:04:00 -0000
Message-ID: <20111029144800.GA16100@bromo.med.uc.edu>
In-Reply-To: <201110281830.35708.mikael.morin@sfr.fr>

On Fri, Oct 28, 2011 at 06:30:35PM +0200, Mikael Morin wrote:
> On Friday 28 October 2011 15:56:36 Jack Howarth wrote:
> > Mikael,
> >     The complete patch bootstraps current FSF gcc trunk on
> > x86_64-apple-darwin11 and the resulting gfortran compiler can compile the
> > Polyhedron 2005 benchmarks using...
> > 
> > Compile Command : gfortran-fsf-4.7 -O3 -ffast-math -funroll-loops -flto
> > -fwhole-program %n.f90 -o %n
> > 
> > without runtime regressions. However I don't seem to see any particular
> > performance improvements with your patches applied. In fact, a few
> > benchmarks including nf and test_fpu seem to show slower runtimes
> > (~8-11%). Have you done any benchmarking with and without the proposed
> > patches? Jack
> 
> Not myself, but previous versions of the patch have been reported to give 
> a noticeable improvement on "tonto" here:
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43829#c26
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43829#c35
> 
> Since those versions, the array constructor handling has been improved, and a 
> few mostly cosmetic changes have been applied, so I expect the posted patch to 
> be on par with the previous ones, possibly slightly better.
> 
> Now regarding your regressions, it is quite a lot worse, and quite unexpected.
> I have just looked at test_fpu.f90 and nf.f90 from a polyhedron source I have 
> found at http://www.polyhedron.com/web_images/documents/pb05.zip. 
> There is no call to product in them, and both use only single-argument sum 
> calls, which are not (or shouldn't be) impacted by my patch (scalar cases). 
> Indeed, if I compare the code produced using -fdump-tree-original, there is 
> zero difference in nf.f90, and in test_fpu.f90 only slight variations which 
> are very very unlikely to cause the regression you see (see attached diff).
> 
> Could you double check your figures, and/or that the regressions are really 
> caused by my patch?

Mikael,
   The problem was that I had tested the patched compiler with quick.par. Full
standard.par testing suggests that identical binaries are produced for pb05
(judging by executable size, anyway)...

Using built-in specs.
COLLECT_GCC=gcc-fsf-4.7
COLLECT_LTO_WRAPPER=/sw/lib/gcc4.7/libexec/gcc/x86_64-apple-darwin11.2.0/4.7.0/lto-wrapper
Target: x86_64-apple-darwin11.2.0
Configured with: ../gcc-4.7-20111028/configure --prefix=/sw --prefix=/sw/lib/gcc4.7 --mandir=/sw/share/man --infodir=/sw/lib/gcc4.7/info --with-build-config=bootstrap-lto --enable-stage1-languages=c,lto --enable-languages=c,c++,fortran,lto,objc,obj-c++,java --with-gmp=/sw --with-libiconv-prefix=/sw --with-ppl=/sw --with-cloog=/sw --with-mpc=/sw --with-system-zlib --x-includes=/usr/X11R6/include --x-libraries=/usr/X11R6/lib --program-suffix=-fsf-4.7 --enable-checking=yes --enable-cloog-backend=isl
Thread model: posix
gcc version 4.7.0 20111028 (experimental) (GCC) 

prepatch at r180613

Date & Time     : 28 Oct 2011 13:47:42
Test Name       : gfortran_lin_O3_wholeprogram
Compile Command : gfortran-fsf-4.7 -O3 -ffast-math -funroll-loops -flto -fwhole-program %n.f90 -o %n
Benchmarks      : ac aermod air capacita channel doduc fatigue gas_dyn induct linpk mdbx nf protein rnflow test_fpu tfft
Maximum Times   :     2000.0
Target Error %  :      0.100
Minimum Repeats :    10
Maximum Repeats :   100

   Benchmark   Compile  Executable   Ave Run  Number   Estim
        Name    (secs)     (bytes)    (secs) Repeats   Err %
   ---------   -------  ----------   ------- -------  ------
          ac      6.75       55000      8.16      10  0.0522
      aermod    119.95     1237720     16.83      13  0.0956
         air     18.38      106960      5.77      33  0.0949
    capacita      6.48       77240     32.61      17  0.0903
     channel      2.21       34904      2.05      19  0.0493
       doduc     20.19      196496     25.98      17  0.0978
     fatigue      7.20       81616      5.98      16  0.0998
     gas_dyn     13.58      119824      4.11      44  0.0854
      induct     12.90      145096     12.86      13  0.0936
       linpk      1.90       26104     15.51      22  0.0667
        mdbx      6.52       81104     11.32      23  0.0995
          nf      6.66       71872     27.17      38  0.0891
     protein     21.47      127264     31.24      15  0.0726
      rnflow     19.51      131056     24.42      19  0.0776
    test_fpu     12.09       97272      7.89      22  0.0399
        tfft      1.63       22464      1.87      21  0.0169

Geometric Mean Execution Time =      10.54 seconds
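
As a cross-check on the summary line above, the geometric mean can be recomputed from the Ave Run column. This is only a minimal Python sketch, not part of the Polyhedron harness; the dictionary and function names are illustrative, and the values were copied by hand from the prepatch table.

```python
import math

# "Ave Run (secs)" column of the prepatch table, copied by hand.
prepatch_run_secs = {
    "ac": 8.16, "aermod": 16.83, "air": 5.77, "capacita": 32.61,
    "channel": 2.05, "doduc": 25.98, "fatigue": 5.98, "gas_dyn": 4.11,
    "induct": 12.86, "linpk": 15.51, "mdbx": 11.32, "nf": 27.17,
    "protein": 31.24, "rnflow": 24.42, "test_fpu": 7.89, "tfft": 1.87,
}


def geometric_mean(values):
    """Return the geometric mean of positive values via the mean of their logs."""
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))


prepatch_gmean = geometric_mean(prepatch_run_secs.values())
print(f"Geometric Mean Execution Time = {prepatch_gmean:10.2f} seconds")
```

The result should agree with the reported figure to within rounding of the two-decimal run times.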

postpatch at r180613

Date & Time     : 28 Oct 2011 16:42:27
Test Name       : gfortran_lin_O3_wholeprogram
Compile Command : gfortran-fsf-4.7 -O3 -ffast-math -funroll-loops -flto -fwhole-program %n.f90 -o %n
Benchmarks      : ac aermod air capacita channel doduc fatigue gas_dyn induct linpk mdbx nf protein rnflow test_fpu tfft
Maximum Times   :     2000.0
Target Error %  :      0.100
Minimum Repeats :    10
Maximum Repeats :   100

   Benchmark   Compile  Executable   Ave Run  Number   Estim
        Name    (secs)     (bytes)    (secs) Repeats   Err %
   ---------   -------  ----------   ------- -------  ------
          ac      6.44       55000      8.16      10  0.0304
      aermod    120.51     1237720     16.88      14  0.0968
         air     19.54      106960      5.78      16  0.0774
    capacita      6.40       77240     32.58      22  0.0796
     channel      2.16       34904      2.05      43  0.0893
       doduc     22.76      196496     25.61      18  0.0407
     fatigue      6.99       81616      5.99      16  0.0852
     gas_dyn     12.92      119824      4.08      28  0.0866
      induct     14.28      145096     12.85      12  0.0829
       linpk      1.97       26104     15.50      14  0.0722
        mdbx      6.52       81104     11.12      20  0.0151
          nf      6.44       71872     27.51      39  0.0935
     protein     20.86      127264     31.21      12  0.0603
      rnflow     20.45      131056     24.40      14  0.0828
    test_fpu     12.10       97272      7.89      24  0.0780
        tfft      1.63       22464      1.87      18  0.0878

Geometric Mean Execution Time =      10.53 seconds
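
The two tables can also be compared benchmark by benchmark, checking that the executable sizes match and how far each average run time moved. The Python sketch below does that with values copied by hand from the prepatch and postpatch tables; the variable names are illustrative and it is not part of the benchmark harness.

```python
# name: (prepatch bytes, postpatch bytes, prepatch secs, postpatch secs)
results = {
    "ac":       (55000, 55000, 8.16, 8.16),
    "aermod":   (1237720, 1237720, 16.83, 16.88),
    "air":      (106960, 106960, 5.77, 5.78),
    "capacita": (77240, 77240, 32.61, 32.58),
    "channel":  (34904, 34904, 2.05, 2.05),
    "doduc":    (196496, 196496, 25.98, 25.61),
    "fatigue":  (81616, 81616, 5.98, 5.99),
    "gas_dyn":  (119824, 119824, 4.11, 4.08),
    "induct":   (145096, 145096, 12.86, 12.85),
    "linpk":    (26104, 26104, 15.51, 15.50),
    "mdbx":     (81104, 81104, 11.32, 11.12),
    "nf":       (71872, 71872, 27.17, 27.51),
    "protein":  (127264, 127264, 31.24, 31.21),
    "rnflow":   (131056, 131056, 24.42, 24.40),
    "test_fpu": (97272, 97272, 7.89, 7.89),
    "tfft":     (22464, 22464, 1.87, 1.87),
}

# True only if every executable has the same size before and after the patch.
same_sizes = all(pre_b == post_b for pre_b, post_b, _, _ in results.values())

# Relative change in average run time, in percent (positive means slower).
pct_change = {
    name: 100.0 * (post_s - pre_s) / pre_s
    for name, (_, _, pre_s, post_s) in results.items()
}

print("executable sizes identical:", same_sizes)
for name, pct in sorted(pct_change.items(), key=lambda kv: abs(kv[1]), reverse=True):
    print(f"{name:>10} {pct:+6.2f}%")
```

With these figures every change stays under 2% and the changes go in both directions (mdbx and doduc faster, nf slower), which would be consistent with run-to-run noise rather than with the ~8-11% slowdowns from the earlier quick.par run.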

> 
> Mikael

> --- test_fpu.f90.003t.original.master	2011-10-28 18:08:53.000000000 +0200
> +++ test_fpu.f90.003t.original.patched	2011-10-28 18:22:28.000000000 +0200
> @@ -1929,6 +1929,7 @@
>                        D.2297 = offset.65 + -1;
>                        atmp.64.dim[0].ubound = D.2297;
>                        pos.61 = D.2297 >= 0 ? 1 : 0;
> +                      offset.62 = 1;
>                        {
>                          integer(kind=8) S.67;
>  
> @@ -1936,7 +1937,6 @@
>                          while (1)
>                            {
>                              if (S.67 > D.2297) goto L.133;
> -                            offset.62 = 1;
>                              if (ABS_EXPR <(*(real(kind=8)[0] * restrict) atmp.64.data)[S.67]> > limit.63)
>                                {
>                                  limit.63 = ABS_EXPR <(*(real(kind=8)[0] * restrict) atmp.64.data)[S.67]>;
> @@ -2406,14 +2406,14 @@
>                            integer(kind=8) D.2457;
>                            integer(kind=8) S.104;
>  
> -                          D.2457 = D.2436 + D.2442;
> -                          D.2458 = stride.45;
> +                          D.2457 = stride.45;
> +                          D.2458 = D.2436 + D.2442;
>                            D.2459 = D.2443 * stride.45 + D.2439;
>                            S.104 = 0;
>                            while (1)
>                              {
>                                if (S.104 > D.2444) goto L.149;
> -                              (*(real(kind=8)[0:] * restrict) atmp.103.data)[S.104] = (*b)[(S.104 + D.2454) * D.2458 + D.2457];
> +                              (*(real(kind=8)[0:] * restrict) atmp.103.data)[S.104] = (*b)[(S.104 + D.2454) * D.2457 + D.2458];
>                                S.104 = S.104 + 1;
>                              }
>                            L.149:;
> @@ -2486,13 +2486,13 @@
>                            integer(kind=8) D.2479;
>                            integer(kind=8) S.106;
>  
> -                          D.2479 = D.2473 + D.2476;
> -                          D.2480 = stride.45;
> +                          D.2479 = stride.45;
> +                          D.2480 = D.2473 + D.2476;
>                            S.106 = D.2471;
>                            while (1)
>                              {
>                                if (S.106 > D.2472) goto L.152;
> -                              (*b)[(S.106 + D.2477) * D.2480 + D.2479] = (*temp)[S.106 + -1];
> +                              (*b)[(S.106 + D.2477) * D.2479 + D.2480] = (*temp)[S.106 + -1];
>                                S.106 = S.106 + 1;
>                              }
>                            L.152:;
> @@ -2756,13 +2756,13 @@
>                        integer(kind=8) D.2549;
>                        integer(kind=8) S.112;
>  
> -                      D.2549 = D.2543 + D.2546;
> -                      D.2550 = stride.45;
> +                      D.2549 = stride.45;
> +                      D.2550 = D.2543 + D.2546;
>                        S.112 = 1;
>                        while (1)
>                          {
>                            if (S.112 > D.2542) goto L.168;
> -                          (*b)[(S.112 + D.2547) * D.2550 + D.2549] = (*temp)[S.112 + -1];
> +                          (*b)[(S.112 + D.2547) * D.2549 + D.2550] = (*temp)[S.112 + -1];
>                            S.112 = S.112 + 1;
>                          }
>                        L.168:;
> @@ -2885,13 +2885,13 @@
>                        integer(kind=8) D.2582;
>                        integer(kind=8) S.115;
>  
> -                      D.2582 = D.2575 + D.2579;
> -                      D.2583 = stride.45;
> +                      D.2582 = stride.45;
> +                      D.2583 = D.2575 + D.2579;
>                        S.115 = 1;
>                        while (1)
>                          {
>                            if (S.115 > D.2578) goto L.176;
> -                          (*temp)[S.115 + -1] = (*b)[(S.115 + D.2580) * D.2583 + D.2582];
> +                          (*temp)[S.115 + -1] = (*b)[(S.115 + D.2580) * D.2582 + D.2583];
>                            S.115 = S.115 + 1;
>                          }
>                        L.176:;
> @@ -3348,6 +3348,7 @@
>                        D.2733 = (integer(kind=8)) *n;
>                        D.2734 = (integer(kind=8)) k;
>                        pos.146 = D.2732 <= D.2733 ? 1 : 0;
> +                      offset.147 = 1 - D.2732;
>                        {
>                          integer(kind=8) D.2736;
>                          integer(kind=8) S.149;
> @@ -3357,7 +3358,6 @@
>                          while (1)
>                            {
>                              if (S.149 > D.2733) goto L.191;
> -                            offset.147 = 1 - D.2732;
>                              if (ABS_EXPR <(*b)[S.149 + D.2736]> > limit.148)
>                                {
>                                  limit.148 = ABS_EXPR <(*b)[S.149 + D.2736]>;
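
Reading the attached diff, the hunks fall into two patterns: an assignment that does not depend on the loop variable (offset.62, offset.147) moves from inside the loop to just before it, and two temporaries swap roles (one holds stride.45, the other the base offset) with the index expression updated to match. The Python sketch below mirrors the second pattern from the hunk at line 2406 to show that both forms compute the same index. The function and parameter names are illustrative stand-ins for the dump's temporaries, not gfortran internals, and the sample values are made up.

```python
def index_master(s, d2454, d2436, d2442, stride45):
    # Master dump: D.2457 holds the base offset, D.2458 holds the stride.
    d2457 = d2436 + d2442
    d2458 = stride45
    return (s + d2454) * d2458 + d2457


def index_patched(s, d2454, d2436, d2442, stride45):
    # Patched dump: the same two values, stored in the opposite temporaries.
    d2457 = stride45
    d2458 = d2436 + d2442
    return (s + d2454) * d2457 + d2458


# Arbitrary sample values for the loop counter and the surrounding temporaries.
samples = [(s, 3, 10, -4, 7) for s in range(5)]
indices_match = all(index_master(*args) == index_patched(*args) for args in samples)
print("index expressions agree:", indices_match)
```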
