public inbox for fortran@gcc.gnu.org
 help / color / mirror / Atom feed
* [patch, fortran] Set default inline matmul limit to 30
@ 2017-02-25 13:29 Thomas Koenig
  2017-02-25 17:17 ` Steve Kargl
  0 siblings, 1 reply; 4+ messages in thread
From: Thomas Koenig @ 2017-02-25 13:29 UTC (permalink / raw)
  To: fortran, gcc-patches

[-- Attachment #1: Type: text/plain, Size: 1091 bytes --]

Hello world,

there still was one piece missing for the new matmul library
version.  To make sure that users (usually) benefit, we need
to call the library by default up from a certain limit.
The attached patch does that, with a limit of 30, which seems
to be reasonable given a few benchmarks.

Some test cases had to be changed to scan the optimized tree
instead of the original because the version still had some
if (0) statement in them.

Regeression-tested. OK for trunk?

Regards

	Thomas

2017-02-25  Thomas Koenig  <tkoenig@gcc.gnu.org>

         PR fortran/51119
         * options.c (gfc_post_options):  Set default limit for matmul
         inlining to 30.
         * invoke.texi:  Document change.

2017-02-25  Thomas Koenig  <tkoenig@gcc.gnu.org>

         PR fortran/51119
         * gfortran.dg/inline_matmul_1.f90:  Scan optimized dump instead
         of original.
         * gfortran.dg/inline_matmul_11.f90:  Likewise.
         * gfortran.dg/inline_matmul_9.f90:  Likewise.
         * gfortran.dg/matmul_13.f90:  New test.
         * gfortran.dg/matmul_14.f90:  New test.

[-- Attachment #2: p1-flag.diff --]
[-- Type: text/x-patch, Size: 3746 bytes --]

Index: fortran/invoke.texi
===================================================================
--- fortran/invoke.texi	(Revision 245564)
+++ fortran/invoke.texi	(Arbeitskopie)
@@ -1630,7 +1630,7 @@ square, the size comparison is performed using the
 the dimensions of the argument and result matrices.
 
 The default value for @var{n} is the value specified for
-@code{-fblas-matmul-limit} if this option is specified, or unlimitited
+@code{-fblas-matmul-limit} if this option is specified, or 30
 otherwise.
 
 @item -frecursive
Index: fortran/options.c
===================================================================
--- fortran/options.c	(Revision 245564)
+++ fortran/options.c	(Arbeitskopie)
@@ -388,10 +388,16 @@ gfc_post_options (const char **pfilename)
   if (!flag_automatic)
     flag_max_stack_var_size = 0;
   
-  /* If we call BLAS directly, only inline up to the BLAS limit.  */
+  /* If the user did not specify an inline matmul limit, inline up to the BLAS
+     limit or up to 30 if no external BLAS is specified.  */
 
-  if (flag_external_blas && flag_inline_matmul_limit < 0)
-    flag_inline_matmul_limit = flag_blas_matmul_limit;
+  if (flag_inline_matmul_limit < 0)
+    {
+      if (flag_external_blas)
+	flag_inline_matmul_limit = flag_blas_matmul_limit;
+      else
+	flag_inline_matmul_limit = 30;
+    }
 
   /* Optimization implies front end optimization, unless the user
      specified it directly.  */
Index: testsuite/gfortran.dg/inline_matmul_1.f90
===================================================================
--- testsuite/gfortran.dg/inline_matmul_1.f90	(Revision 245564)
+++ testsuite/gfortran.dg/inline_matmul_1.f90	(Arbeitskopie)
@@ -1,5 +1,5 @@
 ! { dg-do  run }
-! { dg-options "-ffrontend-optimize -fdump-tree-original -Wrealloc-lhs" }
+! { dg-options "-ffrontend-optimize -fdump-tree-optimized -Wrealloc-lhs" }
 ! PR 37131 - check basic functionality of inlined matmul, making
 ! sure that the library is not called, with and without reallocation.
 
@@ -149,4 +149,4 @@ program main
 
 end program main
 
-! { dg-final { scan-tree-dump-times "_gfortran_matmul" 0 "original" } }
+! { dg-final { scan-tree-dump-times "_gfortran_matmul" 0 "optimized" } }
Index: testsuite/gfortran.dg/inline_matmul_11.f90
===================================================================
--- testsuite/gfortran.dg/inline_matmul_11.f90	(Revision 245564)
+++ testsuite/gfortran.dg/inline_matmul_11.f90	(Arbeitskopie)
@@ -1,5 +1,5 @@
 ! { dg-do  run }
-! { dg-additional-options "-ffrontend-optimize -fdump-tree-original" }
+! { dg-additional-options "-ffrontend-optimize -fdump-tree-optimized" }
 ! PR fortran/66176 - inline conjg for matml.
 program main
   complex, dimension(3,2) :: a
@@ -29,4 +29,4 @@ program main
   c = matmul(conjg(a), b)
   if (any(conjg(c) /= res2)) call abort
 end program main
-! { dg-final { scan-tree-dump-times "_gfortran_matmul" 0 "original" } }
+! { dg-final { scan-tree-dump-times "_gfortran_matmul" 0 "optimized" } }
Index: testsuite/gfortran.dg/inline_matmul_9.f90
===================================================================
--- testsuite/gfortran.dg/inline_matmul_9.f90	(Revision 245564)
+++ testsuite/gfortran.dg/inline_matmul_9.f90	(Arbeitskopie)
@@ -1,5 +1,5 @@
 ! { dg-do  run }
-! { dg-options "-ffrontend-optimize -fdump-tree-original" }
+! { dg-options "-ffrontend-optimize -fdump-tree-optimized" }
 ! PR 66041 - this used to ICE with an incomplete fix for the PR.
 program main
   implicit none
@@ -21,4 +21,4 @@ program main
   if (any (c2-reshape([248., -749.],shape(c2)) /= 0.)) call abort
 end program main
 
-! { dg-final { scan-tree-dump-times "_gfortran_matmul" 0 "original" } }
+! { dg-final { scan-tree-dump-times "_gfortran_matmul" 0 "optimized" } }

[-- Attachment #3: matmul_13.f90 --]
[-- Type: text/x-fortran, Size: 385 bytes --]

! { dg-do compile }
! { dg-options "-O3 -fdump-tree-optimized" }
! Check that the default limit of 30 for inlining matmul applies.
program main
  integer, parameter :: n = 31
  real, dimension(n,n) :: a, b, c
  call random_number(a)
  call random_number(b)
  c = matmul(a,b)
  print *,sum(c)
end program main
! { dg-final { scan-tree-dump-times "_gfortran_matmul_r4" 1 "optimized" } }

[-- Attachment #4: matmul_14.f90 --]
[-- Type: text/x-fortran, Size: 385 bytes --]

! { dg-do compile }
! { dg-options "-O3 -fdump-tree-optimized" }
! Check that the default limit of 30 for inlining matmul applies.
program main
  integer, parameter :: n = 30
  real, dimension(n,n) :: a, b, c
  call random_number(a)
  call random_number(b)
  c = matmul(a,b)
  print *,sum(c)
end program main
! { dg-final { scan-tree-dump-times "_gfortran_matmul_r4" 0 "optimized" } }

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [patch, fortran] Set default inline matmul limit to 30
  2017-02-25 13:29 [patch, fortran] Set default inline matmul limit to 30 Thomas Koenig
@ 2017-02-25 17:17 ` Steve Kargl
  0 siblings, 0 replies; 4+ messages in thread
From: Steve Kargl @ 2017-02-25 17:17 UTC (permalink / raw)
  To: Thomas Koenig; +Cc: fortran, gcc-patches

On Sat, Feb 25, 2017 at 02:29:12PM +0100, Thomas Koenig wrote:
> 
> there still was one piece missing for the new matmul library
> version.  To make sure that users (usually) benefit, we need
> to call the library by default up from a certain limit.
> The attached patch does that, with a limit of 30, which seems
> to be reasonable given a few benchmarks.
> 
> Some test cases had to be changed to scan the optimized tree
> instead of the original because the version still had some
> if (0) statement in them.
> 
> Regeression-tested. OK for trunk?

Looks ok to me with one comment below.

> Index: fortran/invoke.texi
> ===================================================================
> --- fortran/invoke.texi	(Revision 245564)
> +++ fortran/invoke.texi	(Arbeitskopie)
> @@ -1630,7 +1630,7 @@ square, the size comparison is performed using the
>  the dimensions of the argument and result matrices.
>  
>  The default value for @var{n} is the value specified for
> -@code{-fblas-matmul-limit} if this option is specified, or unlimitited
> +@code{-fblas-matmul-limit} if this option is specified, or 30
>  otherwise.

This description looks a little muddled.  I think something like

The default value for N is 30.  The @code{-fblas-matmul-limit}
can be used to change this value.

-- 
Steve

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [patch, fortran] Set default inline matmul limit to 30
  2017-03-01 15:25 Dominique d'Humières
@ 2017-03-01 21:23 ` Thomas Koenig
  0 siblings, 0 replies; 4+ messages in thread
From: Thomas Koenig @ 2017-03-01 21:23 UTC (permalink / raw)
  To: Dominique d'Humières; +Cc: GCC-Fortran-ML

Hi Dominique,

I tried finding anything wrong with the test program, but could not get
it to misbehave on my Linux box (including sanitized stuff, valgrind
and other things).

Could it be that you are simply hitting a stack overflow on your system?
Stack usage has probably gone up; the recursive calls to a routine
containing matmul could exceed some limit on your system.

Regards

	Thomas

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [patch, fortran] Set default inline matmul limit to 30
@ 2017-03-01 15:25 Dominique d'Humières
  2017-03-01 21:23 ` Thomas Koenig
  0 siblings, 1 reply; 4+ messages in thread
From: Dominique d'Humières @ 2017-03-01 15:25 UTC (permalink / raw)
  To: Thomas Koenig; +Cc: GCC-Fortran-ML

After revision r 245745 I see the following failure with both -m32/-m64

FAIL: libgomp.fortran/strassen.f90   -O  execution test

[Book15] f90/bug% /opt/gcc/gcc7p-245744/bin/gfortran -O2 -fopenmp /opt/gcc/work/libgomp/testsuite/libgomp.fortran/strassen.f90
[Book15] f90/bug% ./a.out
 Time for matmul      =   0.738557
 Time for Strassen    =   0.560096
 Time for Strassen MP =   0.242983
[Book15] f90/bug% /opt/gcc/gcc7p-245745/bin/gfortran -O2 -fopenmp /opt/gcc/work/libgomp/testsuite/libgomp.fortran/strassen.f90
[Book15] f90/bug% ./a.out
 Time for matmul      =   0.142246

Backtrace for this error:

Program received signal SIGBUS: Access to an undefined portion of a memory object.

Backtrace for this error:

Program received signal SIGSEGV: Segmentation fault - invalid memory reference.

Backtrace for this error:

Program received signal SIGSEGV: Segmentation fault - invalid memory reference.

Backtrace for this error:
a.out(40405,0x700007682000) malloc: *** mach_vm_map(size=308014541183160320) failed (error code=3)
*** error: can't allocate region
*** set a breakpoint in malloc_error_break to debug
Operating system error: Cannot allocate memory
Memory allocation failed in xmallocarray

Error termination. Backtrace:

Program received signal SIGBUS: Access to an undefined portion of a memory object.

Backtrace for this error:
#0  0x1088d8e69
#1  0x1088d9b65
#2  0x1088d9d62
#0  0x1088d8e69
#1  0x1088d8203
#2  0x7fffc2331bb9
#3  0x1088d85ba

Program received signal SIGSEGV: Segmentation fault - invalid memory reference.

Backtrace for this error:
#0  0x1088d8e69
#1  0x1088d8203
#2  0x7fffc2331bb9
#3  0x10898f3c4
#4  0x1068cc394
#5  0x1068cd56b
#4  0x108a20541
Segmentation fault

This with

Using built-in specs.
COLLECT_GCC=gfc
COLLECT_LTO_WRAPPER=/opt/gcc/gcc7w/libexec/gcc/x86_64-apple-darwin16.4.0/7.0.1/lto-wrapper
Target: x86_64-apple-darwin16.4.0
Configured with: ../work/configure --prefix=/opt/gcc/gcc7w --enable-languages=c,c++,fortran,objc,obj-c++,ada,lto --with-gmp=/opt/mp-new --with-system-zlib --with-isl=/opt/mp-new --enable-lto --enable-plugin --with-arch=corei7 --with-cpu=corei7
Thread model: posix
gcc version 7.0.1 20170301 (experimental) [trunk revision 245806p14] (GCC) 

Do you want me to file a new PR for it?

TIA

Dominique

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2017-03-01 21:23 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-02-25 13:29 [patch, fortran] Set default inline matmul limit to 30 Thomas Koenig
2017-02-25 17:17 ` Steve Kargl
2017-03-01 15:25 Dominique d'Humières
2017-03-01 21:23 ` Thomas Koenig

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).