public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [patch, fortran] Move some array packing to front end
@ 2019-01-22 21:10 Thomas Koenig
  2019-01-23  8:22 ` Richard Biener
  0 siblings, 1 reply; 7+ messages in thread
From: Thomas Koenig @ 2019-01-22 21:10 UTC (permalink / raw)
  To: fortran, gcc-patches

[-- Attachment #1: Type: text/plain, Size: 4074 bytes --]

Hello world,

the attached patch moves the packing / unpacking of arrays to the front
end when optimizing, but not for size.

Rationale: internal_pack and internal_unpack are opaque to the compiler.
This can lead to a lot of information loss for inlining and inter-
procedural optimization, and in extreme cases can lead to huge
slowdowns.

I don't want to do this for -Os or for -O0.  -Os because I want to avoid
size increases, and -O0 for several reasons: The current method works
well, if there should turn out to be a bug still hiding in this code I
want to at least have "works with -O0" in the bug report, and finally
I did not want to rewrite all test cases.

Because run test cases cycle through a lot of optimization options,
I had to split some of them up - test the pattern matches with -O0, test
for run time correctness under all the options.

I have regression-tested this.  I would, however, prefer if some people
could run this patch against their non-testsuite code and report
any problems that this may introduce. So, if you can spare the time
and the cycles, that would be great.

The nice thing about this kind of patch is that, if this does not
work for a certain condition, it is usually straightforward to
check for the condition and then simply not do the optimization.

So, comments?  Bug reports?  OK for trunk if nobody has come
up with a bug in the next few days?

Regards

	Thomas

2019-01-22  Thomas Koenig  <tkoenig@gcc.gnu.org>

         PR fortran/88821
         * expr.c (gfc_is_simply_contiguous): Return true for
         an EXPR_ARRAY.
         * trans-array.c (is_pointer): New function.
         (gfc_conv_array_parameter): Call gfc_conv_subref_array_arg
         when not optimizing and not optimizing for size if the formal
         arg is passed by reference.
         * trans-expr.c (gfc_conv_subref_array_arg): Add arguments
         fsym, proc_name and sym.  Add run-time warning for temporary
         array creation.  Wrap argument if passing on an optional
         argument to an optional argument.
         * trans.h (gfc_conv_subref_array_arg): Add optional arguments
         fsym, proc_name and sym to prototype.

2019-01-22  Thomas Koenig  <tkoenig@gcc.gnu.org>

         PR fortran/88821
         * gfortran.dg/alloc_comp_auto_array_3.f90: Add -O0 to dg-options
         to make sure the test for internal_pack is retained.
         * gfortran.dg/assumed_type_2.f90: Split compile and run time
         tests into this and
         * gfortran.dg/assumed_type_2a.f90: New file.
         * gfortran.dg/c_loc_test_22.f90: Likewise.
         * gfortran.dg/contiguous_3.f90: Likewise.
         * gfortran.dg/internal_pack_11.f90: Likewise.
         * gfortran.dg/internal_pack_12.f90: Likewise.
         * gfortran.dg/internal_pack_16.f90: Likewise.
         * gfortran.dg/internal_pack_17.f90: Likewise.
         * gfortran.dg/internal_pack_18.f90: Likewise.
         * gfortran.dg/internal_pack_4.f90: Likewise.
         * gfortran.dg/internal_pack_5.f90: Add -O0 to dg-options
         to make sure the test for internal_pack is retained.
         * gfortran.dg/internal_pack_6.f90: Split compile and run time
         tests into this and
         * gfortran.dg/internal_pack_6a.f90: New file.
         * gfortran.dg/internal_pack_8.f90: Likewise.
         * gfortran.dg/missing_optional_dummy_6: Split compile and run time
         tests into this and
         * gfortran.dg/missing_optional_dummy_6a.f90: New file.
         * gfortran.dg/no_arg_check_2.f90: Split compile and run time tests
         into this and
         * gfortran.dg/no_arg_check_2a.f90: New file.
         * gfortran.dg/typebound_assignment_5.f90: Split compile and run 
time
         tests into this and
         * gfortran.dg/typebound_assignment_5a.f90: New file.
         * gfortran.dg/typebound_assignment_6.f90: Split compile and run 
time
         tests into this and
         * gfortran.dg/typebound_assignment_6a.f90: New file.
         * gfortran.dg/internal_pack_19.f90: New file.
         * gfortran.dg/internal_pack_20.f90: New file.

[-- Attachment #2: p6.diff --]
[-- Type: text/x-patch, Size: 13354 bytes --]

Index: fortran/expr.c
===================================================================
--- fortran/expr.c	(revision 268104)
+++ fortran/expr.c	(working copy)
@@ -5582,6 +5582,9 @@ gfc_is_simply_contiguous (gfc_expr *expr, bool str
   gfc_ref *ref, *part_ref = NULL;
   gfc_symbol *sym;
 
+  if (expr->expr_type == EXPR_ARRAY)
+    return true;
+
   if (expr->expr_type == EXPR_FUNCTION)
     {
       if (expr->value.function.esym)
Index: fortran/trans-array.c
===================================================================
--- fortran/trans-array.c	(revision 268104)
+++ fortran/trans-array.c	(working copy)
@@ -7755,6 +7755,23 @@ array_parameter_size (tree desc, gfc_expr *expr, t
 			   *size, fold_convert (gfc_array_index_type, elem));
 }
 
+/* Helper function - return true if the argument is a pointer.  */
+ 
+static bool
+is_pointer (gfc_expr *e)
+{
+  gfc_symbol *sym;
+
+  if (e->expr_type != EXPR_VARIABLE ||  e->symtree == NULL)
+    return false;
+
+  sym = e->symtree->n.sym;
+  if (sym == NULL)
+    return false;
+
+  return sym->attr.pointer || sym->attr.proc_pointer;
+}
+
 /* Convert an array for passing as an actual parameter.  */
 
 void
@@ -8006,6 +8023,19 @@ gfc_conv_array_parameter (gfc_se * se, gfc_expr *
 			 "Creating array temporary at %L", &expr->where);
 	}
 
+      /* When optmizing, we can use gfc_conv_subref_array_arg for
+	 making the packing and unpacking operation visible to the
+	 optimizers.  */
+
+      if (g77 && optimize && !optimize_size && expr->expr_type == EXPR_VARIABLE
+	  && !is_pointer (expr))
+	{
+	  gfc_conv_subref_array_arg (se, expr, g77,
+				     fsym ? fsym->attr.intent : INTENT_INOUT,
+				     false, fsym, proc_name, sym);
+	  return;
+	}
+
       ptr = build_call_expr_loc (input_location,
 			     gfor_fndecl_in_pack, 1, desc);
 
Index: fortran/trans-expr.c
===================================================================
--- fortran/trans-expr.c	(revision 268104)
+++ fortran/trans-expr.c	(working copy)
@@ -4536,7 +4536,9 @@ gfc_apply_interface_mapping (gfc_interface_mapping
    after the function call.  */
 void
 gfc_conv_subref_array_arg (gfc_se * parmse, gfc_expr * expr, int g77,
-			   sym_intent intent, bool formal_ptr)
+			   sym_intent intent, bool formal_ptr,
+			   const gfc_symbol *fsym, const char *proc_name,
+			   gfc_symbol *sym)
 {
   gfc_se lse;
   gfc_se rse;
@@ -4553,7 +4555,25 @@ gfc_conv_subref_array_arg (gfc_se * parmse, gfc_ex
   stmtblock_t body;
   int n;
   int dimen;
+  tree parmse_expr;
 
+  if (gfc_option.rtcheck & GFC_RTCHECK_ARRAY_TEMPS)
+    {
+      /* We will create a temporary array, so let us warn.  */
+      char * msg;
+
+      if (fsym && proc_name)
+	msg = xasprintf ("An array temporary was created for argument "
+			     "'%s' of procedure '%s'", fsym->name, proc_name);
+      else
+	msg = xasprintf ("An array temporary was created");
+
+      tmp = build_int_cst (logical_type_node, 1);
+      gfc_trans_runtime_check (false, true, tmp, &parmse->pre,
+			       &expr->where, msg);
+      free (msg);
+    }
+
   gfc_init_se (&lse, NULL);
   gfc_init_se (&rse, NULL);
 
@@ -4803,10 +4823,25 @@ class_array_fcn:
   /* We want either the address for the data or the address of the descriptor,
      depending on the mode of passing array arguments.  */
   if (g77)
-    parmse->expr = gfc_conv_descriptor_data_get (parmse->expr);
+    parmse_expr = gfc_conv_descriptor_data_get (parmse->expr);
   else
-    parmse->expr = gfc_build_addr_expr (NULL_TREE, parmse->expr);
+    parmse_expr = gfc_build_addr_expr (NULL_TREE, parmse->expr);
 
+  /* Wrap in "if (present(x))" if needed.  */
+
+  if (fsym && fsym->attr.optional && sym && sym->attr.optional)
+    {
+      tree present;
+      tree type;
+
+      present = gfc_conv_expr_present (sym);
+      type = TREE_TYPE (parmse_expr);
+      tmp = fold_build3_loc (input_location, COND_EXPR, type, present,
+			     parmse_expr, build_int_cst (type, 0));
+      parmse_expr = tmp;
+    }
+
+  parmse->expr = parmse_expr;
   return;
 }
 
Index: fortran/trans.h
===================================================================
--- fortran/trans.h	(revision 268104)
+++ fortran/trans.h	(working copy)
@@ -529,7 +529,10 @@ int gfc_is_intrinsic_libcall (gfc_expr *);
 int gfc_conv_procedure_call (gfc_se *, gfc_symbol *, gfc_actual_arglist *,
 			     gfc_expr *, vec<tree, va_gc> *);
 
-void gfc_conv_subref_array_arg (gfc_se *, gfc_expr *, int, sym_intent, bool);
+void gfc_conv_subref_array_arg (gfc_se *, gfc_expr *, int, sym_intent, bool,
+				const gfc_symbol *fsym = NULL,
+				const char *proc_name = NULL,
+				gfc_symbol *sym = NULL);
 
 /* Generate code for a scalar assignment.  */
 tree gfc_trans_scalar_assign (gfc_se *, gfc_se *, gfc_typespec, bool, bool,
Index: testsuite/gfortran.dg/alloc_comp_auto_array_3.f90
===================================================================
--- testsuite/gfortran.dg/alloc_comp_auto_array_3.f90	(revision 268104)
+++ testsuite/gfortran.dg/alloc_comp_auto_array_3.f90	(working copy)
@@ -1,5 +1,5 @@
 ! { dg-do compile }
-! { dg-options "-fdump-tree-original" }
+! { dg-options "-O0 -fdump-tree-original" }
 !
 ! Test the fix for PR66082. The original problem was with the first
 ! call foo_1d.
Index: testsuite/gfortran.dg/assumed_type_2.f90
===================================================================
--- testsuite/gfortran.dg/assumed_type_2.f90	(revision 268104)
+++ testsuite/gfortran.dg/assumed_type_2.f90	(working copy)
@@ -1,5 +1,5 @@
-! { dg-do run }
-! { dg-options "-fdump-tree-original" }
+! { dg-do compile }
+! { dg-options "-O0 -fdump-tree-original" }
 !
 ! PR fortran/48820
 !
Index: testsuite/gfortran.dg/c_loc_test_22.f90
===================================================================
--- testsuite/gfortran.dg/c_loc_test_22.f90	(revision 268104)
+++ testsuite/gfortran.dg/c_loc_test_22.f90	(working copy)
@@ -1,5 +1,5 @@
 ! { dg-do compile }
-! { dg-options "-fdump-tree-original" }
+! { dg-options "-O0 -fdump-tree-original" }
 !
 ! PR fortran/56907
 !
Index: testsuite/gfortran.dg/contiguous_3.f90
===================================================================
--- testsuite/gfortran.dg/contiguous_3.f90	(revision 268104)
+++ testsuite/gfortran.dg/contiguous_3.f90	(working copy)
@@ -1,5 +1,5 @@
 ! { dg-do compile }
-! { dg-options "-fdump-tree-original" }
+! { dg-options "-O0 -fdump-tree-original" }
 !
 ! PR fortran/40632
 !
Index: testsuite/gfortran.dg/internal_pack_11.f90
===================================================================
--- testsuite/gfortran.dg/internal_pack_11.f90	(revision 268104)
+++ testsuite/gfortran.dg/internal_pack_11.f90	(working copy)
@@ -1,5 +1,5 @@
 ! { dg-do compile }
-! { dg-options "-fdump-tree-original" }
+! { dg-options "-O0 -fdump-tree-original" }
 !
 ! Test the fix for PR43173, where unnecessary calls to internal_pack/unpack
 ! were being produced below. These references are contiguous and so do not
Index: testsuite/gfortran.dg/internal_pack_12.f90
===================================================================
--- testsuite/gfortran.dg/internal_pack_12.f90	(revision 268104)
+++ testsuite/gfortran.dg/internal_pack_12.f90	(working copy)
@@ -1,5 +1,5 @@
 ! { dg-do compile }
-! { dg-options "-fdump-tree-original" }
+! { dg-options "-O0 -fdump-tree-original" }
 !
 ! Test the fix for PR43243, where unnecessary calls to internal_pack/unpack
 ! were being produced below. These references are contiguous and so do not
Index: testsuite/gfortran.dg/internal_pack_16.f90
===================================================================
--- testsuite/gfortran.dg/internal_pack_16.f90	(revision 268104)
+++ testsuite/gfortran.dg/internal_pack_16.f90	(working copy)
@@ -1,5 +1,5 @@
 ! { dg-do compile }
-! { dg-additional-options "-fdump-tree-original" }
+! { dg-additional-options "-O0 -fdump-tree-original" }
 ! PR 59345 - pack/unpack was not needed here.
 SUBROUTINE S1(A)
  REAL :: A(3)
Index: testsuite/gfortran.dg/internal_pack_17.f90
===================================================================
--- testsuite/gfortran.dg/internal_pack_17.f90	(revision 268104)
+++ testsuite/gfortran.dg/internal_pack_17.f90	(working copy)
@@ -1,5 +1,5 @@
 ! { dg-do compile }
-! { dg-additional-options "-fdump-tree-original" }
+! { dg-additional-options "-O0 -fdump-tree-original" }
 ! PR 59345 - pack/unpack was not needed here.
 ! Original test case by Joost VandeVondele 
 SUBROUTINE S1(A)
Index: testsuite/gfortran.dg/internal_pack_18.f90
===================================================================
--- testsuite/gfortran.dg/internal_pack_18.f90	(revision 268104)
+++ testsuite/gfortran.dg/internal_pack_18.f90	(working copy)
@@ -1,5 +1,5 @@
 ! { dg-do compile }
-! { dg-additional-options "-fdump-tree-original" }
+! { dg-additional-options "-O0 -fdump-tree-original" }
 ! PR 57992 - this was packed/unpacked unnecessarily.
 ! Original case by Tobias Burnus.
 subroutine test
Index: testsuite/gfortran.dg/internal_pack_4.f90
===================================================================
--- testsuite/gfortran.dg/internal_pack_4.f90	(revision 268104)
+++ testsuite/gfortran.dg/internal_pack_4.f90	(working copy)
@@ -1,5 +1,4 @@
 ! { dg-do run }
-! { dg-options "-fdump-tree-original" }
 !
 ! PR fortran/36132
 !
@@ -25,6 +24,3 @@ END MODULE M1
 USE M1
 CALL S2()
 END
-
-! { dg-final { scan-tree-dump-times "a != 0B \\? \\\(.*\\\) _gfortran_internal_pack" 1 "original" } }
-! { dg-final { scan-tree-dump-times "if \\(a != 0B &&" 1 "original" } }
Index: testsuite/gfortran.dg/internal_pack_5.f90
===================================================================
--- testsuite/gfortran.dg/internal_pack_5.f90	(revision 268104)
+++ testsuite/gfortran.dg/internal_pack_5.f90	(working copy)
@@ -1,5 +1,5 @@
 ! { dg-do compile }
-! { dg-options "-fdump-tree-original" }
+! { dg-options "-O0 -fdump-tree-original" }
 !
 ! PR fortran/36909
 !
Index: testsuite/gfortran.dg/internal_pack_6.f90
===================================================================
--- testsuite/gfortran.dg/internal_pack_6.f90	(revision 268104)
+++ testsuite/gfortran.dg/internal_pack_6.f90	(working copy)
@@ -1,5 +1,5 @@
-! { dg-do run }
-! { dg-options "-fdump-tree-original" }
+! { dg-do compile }
+! { dg-options "-O0 -fdump-tree-original" }
 !
 ! Test the fix for PR41113 and PR41117, in which unnecessary calls
 ! to internal_pack and internal_unpack were being generated.
Index: testsuite/gfortran.dg/internal_pack_9.f90
===================================================================
--- testsuite/gfortran.dg/internal_pack_9.f90	(revision 268104)
+++ testsuite/gfortran.dg/internal_pack_9.f90	(working copy)
@@ -1,5 +1,5 @@
 ! { dg-do compile }
-! { dg-options "-fdump-tree-original" }
+! { dg-options "-O0 -fdump-tree-original" }
 !
 ! During the discussion of the fix for PR43072, in which unnecessary
 ! calls to internal PACK/UNPACK were being generated, the following,
Index: testsuite/gfortran.dg/missing_optional_dummy_6.f90
===================================================================
--- testsuite/gfortran.dg/missing_optional_dummy_6.f90	(revision 268104)
+++ testsuite/gfortran.dg/missing_optional_dummy_6.f90	(working copy)
@@ -46,14 +46,3 @@ contains
   end subroutine scalar2
 
 end program test
-
-! { dg-final { scan-tree-dump-times "scalar2 \\(slr1" 1 "original" } }
-
-! { dg-final { scan-tree-dump-times "= es1 != 0B" 1 "original" } }
-! { dg-final { scan-tree-dump-times "assumed_shape2 \\(es1" 0 "original" } }
-! { dg-final { scan-tree-dump-times "explicit_shape2 \\(es1" 1 "original" } }
-
-! { dg-final { scan-tree-dump-times "= as1 != 0B" 2 "original" } }
-! { dg-final { scan-tree-dump-times "assumed_shape2 \\(as1" 0 "original" } }
-! { dg-final { scan-tree-dump-times "explicit_shape2 \\(as1" 0 "original" } }
-
Index: testsuite/gfortran.dg/no_arg_check_2.f90
===================================================================
--- testsuite/gfortran.dg/no_arg_check_2.f90	(revision 268104)
+++ testsuite/gfortran.dg/no_arg_check_2.f90	(working copy)
@@ -1,5 +1,5 @@
-! { dg-do run }
-! { dg-options "-fdump-tree-original" }
+! { dg-do compile }
+! { dg-options "-O0 -fdump-tree-original" }
 !
 ! PR fortran/39505
 ! 
Index: testsuite/gfortran.dg/typebound_assignment_5.f03
===================================================================
--- testsuite/gfortran.dg/typebound_assignment_5.f03	(revision 268104)
+++ testsuite/gfortran.dg/typebound_assignment_5.f03	(working copy)
@@ -1,5 +1,5 @@
-! { dg-do run }
-! { dg-options "-fdump-tree-original" }
+! { dg-do compile }
+! { dg-options "-O0 -fdump-tree-original" }
 !
 ! PR fortran/49074
 ! ICE on defined assignment with class arrays.
Index: testsuite/gfortran.dg/typebound_assignment_6.f03
===================================================================
--- testsuite/gfortran.dg/typebound_assignment_6.f03	(revision 268104)
+++ testsuite/gfortran.dg/typebound_assignment_6.f03	(working copy)
@@ -1,5 +1,4 @@
 ! { dg-do run }
-! { dg-options "-fdump-tree-original" }
 !
 ! PR fortran/56136
 ! ICE on defined assignment with class arrays.
@@ -37,6 +36,3 @@
         IF (ANY(A(2::2)%I /= (/ ((50+2*I, I=1,SIZE(A)/4), J=1,2) /))) STOP 3
       END PROGRAM
 
-! { dg-final { scan-tree-dump-times "_gfortran_internal_pack" 1 "original" } }
-! { dg-final { scan-tree-dump-times "_gfortran_internal_unpack" 1 "original" } }
-

[-- Attachment #3: internal_pack_19.f90 --]
[-- Type: text/x-fortran, Size: 539 bytes --]

! { dg-do compile }
! { dg-options "-Os -fdump-tree-original" }
! Check that internal_pack is called with -Os.
module x
  implicit none
contains
  subroutine bar(a, n)
    integer, intent(in) :: n
    integer, intent(in), dimension(n) :: a
    print *,a
  end subroutine bar
end module x

program main
  use x
  implicit none
  integer, parameter :: n = 10
  integer, dimension(n) :: a
  integer :: i
  a = [(i,i=1,n)]
  call bar(a(n:1:-1),n)
end program main
! { dg-final { scan-tree-dump-times "_gfortran_internal_pack" 1 "original" } }

[-- Attachment #4: internal_pack_20.f90 --]
[-- Type: text/x-fortran, Size: 537 bytes --]

! { dg-do compile }
! { dg-options "-O -fdump-tree-original" }
! Check that internal_pack is not called with -O.
module x
  implicit none
contains
  subroutine bar(a, n)
    integer, intent(in) :: n
    integer, intent(in), dimension(n) :: a
    print *,a
  end subroutine bar
end module x

program main
  use x
  implicit none
  integer, parameter :: n = 10
  integer, dimension(n) :: a
  integer :: i
  a = [(i,i=1,n)]
  call bar(a(n:1:-1),n)
end program main
! { dg-final { scan-tree-dump-not "_gfortran_internal_pack" "original" } }

[-- Attachment #5: missing_optional_dummy_6a.f90 --]
[-- Type: text/x-fortran, Size: 1622 bytes --]

! { dg-do compile }
! { dg-options "-O0 -fdump-tree-original" }
!
! PR fortran/41907
!
program test
  implicit none
  call scalar1 ()
  call assumed_shape1 ()
  call explicit_shape1 ()
contains

  ! Calling functions
  subroutine scalar1 (slr1)
    integer, optional :: slr1
    call scalar2 (slr1)
  end subroutine scalar1

  subroutine assumed_shape1 (as1)
    integer, dimension(:), optional :: as1
    call assumed_shape2 (as1)
    call explicit_shape2 (as1)
  end subroutine assumed_shape1

  subroutine explicit_shape1 (es1)
    integer, dimension(5), optional :: es1
    call assumed_shape2 (es1)
    call explicit_shape2 (es1)
  end subroutine explicit_shape1


  ! Called functions
  subroutine assumed_shape2 (as2)
    integer, dimension(:),optional :: as2
    if (present (as2)) STOP 1
  end subroutine assumed_shape2

  subroutine explicit_shape2 (es2)
    integer, dimension(5),optional :: es2
    if (present (es2)) STOP 2
  end subroutine explicit_shape2

  subroutine scalar2 (slr2)
    integer, optional :: slr2
    if (present (slr2)) STOP 3
  end subroutine scalar2

end program test

! { dg-final { scan-tree-dump-times "scalar2 \\(slr1" 1 "original" } }

! { dg-final { scan-tree-dump-times "= es1 != 0B" 1 "original" } }
! { dg-final { scan-tree-dump-times "assumed_shape2 \\(es1" 0 "original" } }
! { dg-final { scan-tree-dump-times "explicit_shape2 \\(es1" 1 "original" } }

! { dg-final { scan-tree-dump-times "= as1 != 0B" 2 "original" } }
! { dg-final { scan-tree-dump-times "assumed_shape2 \\(as1" 0 "original" } }
! { dg-final { scan-tree-dump-times "explicit_shape2 \\(as1" 0 "original" } }


[-- Attachment #6: typebound_assignment_6a.f03 --]
[-- Type: text/plain, Size: 1267 bytes --]

! { dg-do compile }
! { dg-options "-O0 -fdump-tree-original" }
!
! PR fortran/56136
! ICE on defined assignment with class arrays.
!
! Original testcase by Alipasha <alipash.celeris@gmail.com>

      MODULE A_TEST_M
        TYPE :: A_TYPE
          INTEGER :: I
          CONTAINS
          GENERIC :: ASSIGNMENT (=) => ASGN_A
          PROCEDURE, PRIVATE :: ASGN_A
        END TYPE

        CONTAINS

        ELEMENTAL SUBROUTINE ASGN_A (A, B)
          CLASS (A_TYPE), INTENT (INOUT) :: A
          CLASS (A_TYPE), INTENT (IN) :: B
          A%I = B%I
        END SUBROUTINE
      END MODULE A_TEST_M
      
      PROGRAM ASGN_REALLOC_TEST
        USE A_TEST_M
        TYPE (A_TYPE), ALLOCATABLE :: A(:)
        INTEGER :: I, J

        ALLOCATE (A(100))
        A = (/ (A_TYPE(I), I=1,SIZE(A)) /)
        A(1:50) = A(51:100)
        IF (ANY(A%I /= (/ ((50+I, I=1,SIZE(A)/2), J=1,2) /))) STOP 1
        A(::2) = A(1:50)        ! pack/unpack
        IF (ANY(A( ::2)%I /= (/ (50+I, I=1,SIZE(A)/2) /))) STOP 2
        IF (ANY(A(2::2)%I /= (/ ((50+2*I, I=1,SIZE(A)/4), J=1,2) /))) STOP 3
      END PROGRAM

! { dg-final { scan-tree-dump-times "_gfortran_internal_pack" 1 "original" } }
! { dg-final { scan-tree-dump-times "_gfortran_internal_unpack" 1 "original" } }


[-- Attachment #7: typebound_assignment_5a.f03 --]
[-- Type: text/plain, Size: 797 bytes --]

! { dg-do run }
!
! PR fortran/49074
! ICE on defined assignment with class arrays.

      module foo
        type bar
          integer :: i

          contains

          generic :: assignment (=) => assgn_bar
          procedure, private :: assgn_bar
        end type bar

        contains

        elemental subroutine assgn_bar (a, b)
          class (bar), intent (inout) :: a
          class (bar), intent (in) :: b

          select type (b)
          type is (bar)
            a%i = b%i
          end select

          return
        end subroutine assgn_bar
      end module foo

      program main
        use foo

        type (bar), allocatable :: foobar(:)

        allocate (foobar(2))
        foobar = [bar(1), bar(2)]
        if (any(foobar%i /= [1, 2])) STOP 1
      end program

[-- Attachment #8: no_arg_check_2a.f90 --]
[-- Type: text/x-fortran, Size: 3477 bytes --]

! { dg-do run }
!
! PR fortran/39505
! 
! Test NO_ARG_CHECK
! Copied from assumed_type_2.f90
!

module mod
  use iso_c_binding, only: c_loc, c_ptr, c_bool
  implicit none
  interface my_c_loc
    function my_c_loc1(x) bind(C)
      import c_ptr
!GCC$ attributes NO_ARG_CHECK :: x
      type(*) :: x
      type(c_ptr) :: my_c_loc1
    end function
  end interface my_c_loc
contains
  subroutine sub_scalar (arg1, presnt)
     integer(8), target, optional :: arg1
     logical :: presnt
     type(c_ptr) :: cpt
!GCC$ attributes NO_ARG_CHECK :: arg1
     if (presnt .neqv. present (arg1)) STOP 1
     cpt = c_loc (arg1)
  end subroutine sub_scalar

  subroutine sub_array_assumed (arg3)
!GCC$ attributes NO_ARG_CHECK :: arg3
     logical(1), target :: arg3(*)
     type(c_ptr) :: cpt
     cpt = c_loc (arg3)
  end subroutine sub_array_assumed
end module

use mod
use iso_c_binding, only: c_int, c_null_ptr
implicit none
type t1
  integer :: a
end type t1
type :: t2
  sequence
  integer :: b
end type t2
type, bind(C) :: t3
  integer(c_int) :: c
end type t3

integer            :: scalar_int
real, allocatable  :: scalar_real_alloc
character, pointer :: scalar_char_ptr

integer            :: array_int(3)
real, allocatable  :: array_real_alloc(:,:)
character, pointer :: array_char_ptr(:,:)

type(t1)              :: scalar_t1
type(t2), allocatable :: scalar_t2_alloc
type(t3), pointer     :: scalar_t3_ptr

type(t1)              :: array_t1(4)
type(t2), allocatable :: array_t2_alloc(:,:)
type(t3), pointer     :: array_t3_ptr(:,:)

class(t1), allocatable :: scalar_class_t1_alloc
class(t1), pointer     :: scalar_class_t1_ptr

class(t1), allocatable :: array_class_t1_alloc(:,:)
class(t1), pointer     :: array_class_t1_ptr(:,:)

scalar_char_ptr => null()
scalar_t3_ptr => null()

call sub_scalar (presnt=.false.)
call sub_scalar (scalar_real_alloc, .false.)
call sub_scalar (scalar_char_ptr, .false.)
call sub_scalar (null (), .false.)
call sub_scalar (scalar_t2_alloc, .false.)
call sub_scalar (scalar_t3_ptr, .false.)

allocate (scalar_real_alloc, scalar_char_ptr, scalar_t3_ptr)
allocate (scalar_class_t1_alloc, scalar_class_t1_ptr, scalar_t2_alloc)
allocate (array_real_alloc(3:5,2:4), array_char_ptr(-2:2,2))
allocate (array_t2_alloc(3:5,2:4), array_t3_ptr(-2:2,2))
allocate (array_class_t1_alloc(3,3), array_class_t1_ptr(4,4))

call sub_scalar (scalar_int, .true.)
call sub_scalar (scalar_real_alloc, .true.)
call sub_scalar (scalar_char_ptr, .true.)
call sub_scalar (array_int(2), .true.)
call sub_scalar (array_real_alloc(3,2), .true.)
call sub_scalar (array_char_ptr(0,1), .true.)
call sub_scalar (scalar_t1, .true.)
call sub_scalar (scalar_t2_alloc, .true.)
call sub_scalar (scalar_t3_ptr, .true.)
call sub_scalar (array_t1(2), .true.)
call sub_scalar (array_t2_alloc(3,2), .true.)
call sub_scalar (array_t3_ptr(0,1), .true.)
call sub_scalar (array_class_t1_alloc(2,1), .true.)
call sub_scalar (array_class_t1_ptr(3,3), .true.)

call sub_array_assumed (array_int)
call sub_array_assumed (array_real_alloc)
call sub_array_assumed (array_char_ptr)
call sub_array_assumed (array_t1)
call sub_array_assumed (array_t2_alloc)
call sub_array_assumed (array_t3_ptr)
call sub_array_assumed (array_class_t1_alloc)
call sub_array_assumed (array_class_t1_ptr)

deallocate (scalar_char_ptr, scalar_class_t1_ptr, array_char_ptr)
deallocate (array_class_t1_ptr, array_t3_ptr)
contains
  subroutine sub(x)
    integer :: x(:)
    call sub_array_assumed (x)
  end subroutine sub
end

[-- Attachment #9: internal_pack_6a.f90 --]
[-- Type: text/x-fortran, Size: 1217 bytes --]

! { dg-do run }
!
! Test the fix for PR41113 and PR41117, in which unnecessary calls
! to internal_pack and internal_unpack were being generated.
!
! Contributed by Joost VandeVondele <jv244@cam.ac.uk>
!
MODULE M1
 TYPE T1
   REAL :: data(10) = [(i, i = 1, 10)]
 END TYPE T1
CONTAINS
 SUBROUTINE S1(data, i, chksum)
   REAL, DIMENSION(*) :: data
   integer :: i, j
   real :: subsum, chksum
   subsum = 0
   do j = 1, i
     subsum = subsum + data(j)
   end do
   if (abs(subsum - chksum) > 1e-6) STOP 1
 END SUBROUTINE S1
END MODULE

SUBROUTINE S2
 use m1
 TYPE(T1) :: d

 real :: data1(10) = [(i, i = 1, 10)]
 REAL :: data(-4:5,-4:5) = reshape ([(real(i), i = 1, 100)], [10,10])

! PR41113
 CALL S1(d%data, 10, sum (d%data))
 CALL S1(data1, 10, sum (data1))

! PR41117
 DO i=-4,5
    CALL S1(data(:,i), 10, sum (data(:,i)))
 ENDDO

! With the fix for PR41113/7 this is the only time that _internal_pack
! was called.  The final part of the fix for PR43072 put paid to it too.
 DO i=-4,5
    CALL S1(data(-2:,i), 8, sum (data(-2:,i)))
 ENDDO
 DO i=-4,4
    CALL S1(data(:,i:i+1), 20, sum (reshape (data(:,i:i+1), [20])))
 ENDDO
 DO i=-4,5
    CALL S1(data(2,i), 1, data(2,i))
 ENDDO
END SUBROUTINE S2

 call s2
end


[-- Attachment #10: assumed_type_2a.f90 --]
[-- Type: text/x-fortran, Size: 4362 bytes --]

! { dg-do run }
!
! PR fortran/48820
!
! Test TYPE(*)
!

module mod
  use iso_c_binding, only: c_loc, c_ptr, c_bool
  implicit none
  interface my_c_loc
    function my_c_loc1(x) bind(C)
      import c_ptr
      type(*) :: x
      type(c_ptr) :: my_c_loc1
    end function
    function my_c_loc2(x) bind(C)
      import c_ptr
      type(*) :: x(*)
      type(c_ptr) :: my_c_loc2
    end function
  end interface my_c_loc
contains
  subroutine sub_scalar (arg1, presnt)
     type(*), target, optional :: arg1
     logical :: presnt
     type(c_ptr) :: cpt
     if (presnt .neqv. present (arg1)) STOP 1
     cpt = c_loc (arg1)
  end subroutine sub_scalar

  subroutine sub_array_shape (arg2, lbounds, ubounds)
     type(*), target :: arg2(:,:)
     type(c_ptr) :: cpt
     integer :: lbounds(2), ubounds(2)
     if (any (lbound(arg2) /= lbounds)) STOP 2
     if (any (ubound(arg2) /= ubounds)) STOP 3
     if (any (shape(arg2) /= ubounds-lbounds+1)) STOP 4
     if (size(arg2) /= product (ubounds-lbounds+1)) STOP 5
     if (rank (arg2) /= 2) STOP 6
!     if (.not. is_continuous (arg2)) STOP 7 !<< Not yet implemented
!     cpt = c_loc (arg2) ! << FIXME: Valid since TS29113
     call sub_array_assumed (arg2)
  end subroutine sub_array_shape

  subroutine sub_array_assumed (arg3)
     type(*), target :: arg3(*)
     type(c_ptr) :: cpt
     cpt = c_loc (arg3)
  end subroutine sub_array_assumed
end module

use mod
use iso_c_binding, only: c_int, c_null_ptr
implicit none
type t1
  integer :: a
end type t1
type :: t2
  sequence
  integer :: b
end type t2
type, bind(C) :: t3
  integer(c_int) :: c
end type t3

integer            :: scalar_int
real, allocatable  :: scalar_real_alloc
character, pointer :: scalar_char_ptr

integer            :: array_int(3)
real, allocatable  :: array_real_alloc(:,:)
character, pointer :: array_char_ptr(:,:)

type(t1)              :: scalar_t1
type(t2), allocatable :: scalar_t2_alloc
type(t3), pointer     :: scalar_t3_ptr

type(t1)              :: array_t1(4)
type(t2), allocatable :: array_t2_alloc(:,:)
type(t3), pointer     :: array_t3_ptr(:,:)

class(t1), allocatable :: scalar_class_t1_alloc
class(t1), pointer     :: scalar_class_t1_ptr

class(t1), allocatable :: array_class_t1_alloc(:,:)
class(t1), pointer     :: array_class_t1_ptr(:,:)

scalar_char_ptr => null()
scalar_t3_ptr => null()

call sub_scalar (presnt=.false.)
call sub_scalar (scalar_real_alloc, .false.)
call sub_scalar (scalar_char_ptr, .false.)
call sub_scalar (null (), .false.)
call sub_scalar (scalar_t2_alloc, .false.)
call sub_scalar (scalar_t3_ptr, .false.)

allocate (scalar_real_alloc, scalar_char_ptr, scalar_t3_ptr)
allocate (scalar_class_t1_alloc, scalar_class_t1_ptr, scalar_t2_alloc)
allocate (array_real_alloc(3:5,2:4), array_char_ptr(-2:2,2))
allocate (array_t2_alloc(3:5,2:4), array_t3_ptr(-2:2,2))
allocate (array_class_t1_alloc(3,3), array_class_t1_ptr(4,4))

call sub_scalar (scalar_int, .true.)
call sub_scalar (scalar_real_alloc, .true.)
call sub_scalar (scalar_char_ptr, .true.)
call sub_scalar (array_int(2), .true.)
call sub_scalar (array_real_alloc(3,2), .true.)
call sub_scalar (array_char_ptr(0,1), .true.)
call sub_scalar (scalar_t1, .true.)
call sub_scalar (scalar_t2_alloc, .true.)
call sub_scalar (scalar_t3_ptr, .true.)
call sub_scalar (array_t1(2), .true.)
call sub_scalar (array_t2_alloc(3,2), .true.)
call sub_scalar (array_t3_ptr(0,1), .true.)
call sub_scalar (array_class_t1_alloc(2,1), .true.)
call sub_scalar (array_class_t1_ptr(3,3), .true.)

call sub_array_assumed (array_int)
call sub_array_assumed (array_real_alloc)
call sub_array_assumed (array_char_ptr)
call sub_array_assumed (array_t1)
call sub_array_assumed (array_t2_alloc)
call sub_array_assumed (array_t3_ptr)
call sub_array_assumed (array_class_t1_alloc)
call sub_array_assumed (array_class_t1_ptr)

call sub_array_shape (array_real_alloc, [1,1], shape(array_real_alloc))
call sub_array_shape (array_char_ptr, [1,1], shape(array_char_ptr))
call sub_array_shape (array_t2_alloc, [1,1], shape(array_t2_alloc))
call sub_array_shape (array_t3_ptr, [1,1], shape(array_t3_ptr))
call sub_array_shape (array_class_t1_alloc, [1,1], shape(array_class_t1_alloc))
call sub_array_shape (array_class_t1_ptr, [1,1], shape(array_class_t1_ptr))

deallocate (scalar_char_ptr, scalar_class_t1_ptr, array_char_ptr)
deallocate (array_class_t1_ptr, array_t3_ptr)

end

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [patch, fortran] Move some array packing to front end
  2019-01-22 21:10 [patch, fortran] Move some array packing to front end Thomas Koenig
@ 2019-01-23  8:22 ` Richard Biener
  0 siblings, 0 replies; 7+ messages in thread
From: Richard Biener @ 2019-01-23  8:22 UTC (permalink / raw)
  To: Thomas Koenig; +Cc: fortran, gcc-patches

On Tue, Jan 22, 2019 at 9:59 PM Thomas Koenig <tkoenig@netcologne.de> wrote:
>
> Hello world,
>
> the attached patch moves the packing / unpacking of arrays to the front
> end when optimizing, but not for size.
>
> Rationale: internal_pack and internal_unpack are opaque to the compiler.
> This can lead to a lot of information loss for inlining and inter-
> procedural optimization, and in extreme cases can lead to huge
> slowdowns.
>
> I don't want to do this for -Os or for -O0.  -Os because I want to avoid
> size increases, and -O0 for several reasons: The current method works
> well, if there should turn out to be a bug still hiding in this code I
> want to at least have "works with -O0" in the bug report, and finally
> I did not want to rewrite all test cases.
>
> Because run test cases cycle through a lot of optimization options,
> I had to split some of them up - test the pattern matches with -O0, test
> for run time correctness under all the options.
>
> I have regression-tested this.  I would, however, prefer if some people
> could run this patch against their non-testsuite code and report
> any problems that this may introduce. So, if you can spare the time
> and the cycles, that would be great.
>
> The nice thing about this kind of patch is that, if this does not
> work for a certain condition, it is usually straightforward to
> check for the condition and then simply not do the optimization.
>
> So, comments?  Bug reports?  OK for trunk if nobody has come
> up with a bug in the next few days?

Note for this kind of changes it is approprate to wait for stage1 to open.

Of course since you're not release critical you can override this
suggestion as you please.

Richard.

> Regards
>
>         Thomas
>
> 2019-01-22  Thomas Koenig  <tkoenig@gcc.gnu.org>
>
>          PR fortran/88821
>          * expr.c (gfc_is_simply_contiguous): Return true for
>          an EXPR_ARRAY.
>          * trans-array.c (is_pointer): New function.
>          (gfc_conv_array_parameter): Call gfc_conv_subref_array_arg
>          when not optimizing and not optimizing for size if the formal
>          arg is passed by reference.
>          * trans-expr.c (gfc_conv_subref_array_arg): Add arguments
>          fsym, proc_name and sym.  Add run-time warning for temporary
>          array creation.  Wrap argument if passing on an optional
>          argument to an optional argument.
>          * trans.h (gfc_conv_subref_array_arg): Add optional arguments
>          fsym, proc_name and sym to prototype.
>
> 2019-01-22  Thomas Koenig  <tkoenig@gcc.gnu.org>
>
>          PR fortran/88821
>          * gfortran.dg/alloc_comp_auto_array_3.f90: Add -O0 to dg-options
>          to make sure the test for internal_pack is retained.
>          * gfortran.dg/assumed_type_2.f90: Split compile and run time
>          tests into this and
>          * gfortran.dg/assumed_type_2a.f90: New file.
>          * gfortran.dg/c_loc_test_22.f90: Likewise.
>          * gfortran.dg/contiguous_3.f90: Likewise.
>          * gfortran.dg/internal_pack_11.f90: Likewise.
>          * gfortran.dg/internal_pack_12.f90: Likewise.
>          * gfortran.dg/internal_pack_16.f90: Likewise.
>          * gfortran.dg/internal_pack_17.f90: Likewise.
>          * gfortran.dg/internal_pack_18.f90: Likewise.
>          * gfortran.dg/internal_pack_4.f90: Likewise.
>          * gfortran.dg/internal_pack_5.f90: Add -O0 to dg-options
>          to make sure the test for internal_pack is retained.
>          * gfortran.dg/internal_pack_6.f90: Split compile and run time
>          tests into this and
>          * gfortran.dg/internal_pack_6a.f90: New file.
>          * gfortran.dg/internal_pack_8.f90: Likewise.
>          * gfortran.dg/missing_optional_dummy_6: Split compile and run time
>          tests into this and
>          * gfortran.dg/missing_optional_dummy_6a.f90: New file.
>          * gfortran.dg/no_arg_check_2.f90: Split compile and run time tests
>          into this and
>          * gfortran.dg/no_arg_check_2a.f90: New file.
>          * gfortran.dg/typebound_assignment_5.f90: Split compile and run
> time
>          tests into this and
>          * gfortran.dg/typebound_assignment_5a.f90: New file.
>          * gfortran.dg/typebound_assignment_6.f90: Split compile and run
> time
>          tests into this and
>          * gfortran.dg/typebound_assignment_6a.f90: New file.
>          * gfortran.dg/internal_pack_19.f90: New file.
>          * gfortran.dg/internal_pack_20.f90: New file.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [patch, fortran] Move some array packing to front end
  2019-01-24 11:01   ` Richard Biener
@ 2019-02-07  6:31     ` Chris Elrod
  0 siblings, 0 replies; 7+ messages in thread
From: Chris Elrod @ 2019-02-07  6:31 UTC (permalink / raw)
  To: Richard Biener
  Cc: Thomas Koenig, Dominique d'Humières, gfortran, gcc-patches

Hi,
I just compiled gcc-trunk after applying your patches (p8.diff) and H.J.
Lu's patches from *Bug 88713*
<https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88713> (vectorized code slow
vs Flang) fixing rsqrt.

Thomas Koenig already confirmed it fixed the problem I recorded there, but
figured I'd add that I can also confirm it improved the benchmarks
dramatically. gfortran and g++ now seem to be the fastest (one of the
Fortran benchmarks saw over 100x improvement). Between those two patches,
Bug 88713 will be solved.

LLVM also seems to have an issue where, if you don't store inside the
function being called within the loop, the loop body ends up with a lot of
leaq instructions. gcc now seems to produce the same code regardless
(without those unnecessary instructions). Obviously, just one simple narrow
example, but exciting to see that improvement!


On Thu, Jan 24, 2019 at 4:56 AM Richard Biener <richard.guenther@gmail.com>
wrote:

> On Wed, Jan 23, 2019 at 6:18 PM Thomas Koenig <tkoenig@netcologne.de>
> wrote:
> >
> > Hi Dominique,
> >
> > > FAIL: gfortran.dg/internal_pack_4.f90   -O3 -fomit-frame-pointer
> -funroll-loops -fpeel-loops -ftracer -finline-functions  execution test
> > >
> > > with -m32.
> > >
> > > gfc /opt/gcc/work/gcc/testsuite/gfortran.dg/internal_pack_4.f90 -O3
> -funroll-loops -ftracer -m32
> > >
> > > is enough to trigger the miscomputation.
> >
> > Thanks, I will look into it.
> >
> > > The changes in the test suite are quite messy and I hope I did not
> miss any test (you should do "diff -N …" for the new tests).
> >
> > I don't think this is a good idea.  Applying the patch twice will then
> > double the test case.
> >
> >
> > > Do you have test showing a speed-up?
> >
> > It' in the PR.
> >
> >
> > > I agree with Richard that this patch should be held until the next
> stage 1.
> >
> > I just realized that we are, in principle, in regression-only mode.
> >
> > I do wish the announcements were also made to the fortran mailing lists.
>
> They are made to the low-traffic gcc@ list which everybody working on GCC
> is supposed to subscribe to.
>
> > Anyway, I'll see if I can fix that bug, then attach the combined
> > patch to the PR for later inclusion.
> >
> > Regards
> >
> >         Thomas
>


-- 
https://github.com/chriselrod?tab=repositories
https://www.linkedin.com/in/chris-elrod-9720391a/

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [patch, fortran] Move some array packing to front end
  2019-01-23 17:27 ` Thomas Koenig
@ 2019-01-24 11:01   ` Richard Biener
  2019-02-07  6:31     ` Chris Elrod
  0 siblings, 1 reply; 7+ messages in thread
From: Richard Biener @ 2019-01-24 11:01 UTC (permalink / raw)
  To: Thomas Koenig; +Cc: Dominique d'Humières, gfortran, gcc-patches

On Wed, Jan 23, 2019 at 6:18 PM Thomas Koenig <tkoenig@netcologne.de> wrote:
>
> Hi Dominique,
>
> > FAIL: gfortran.dg/internal_pack_4.f90   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  execution test
> >
> > with -m32.
> >
> > gfc /opt/gcc/work/gcc/testsuite/gfortran.dg/internal_pack_4.f90 -O3 -funroll-loops -ftracer -m32
> >
> > is enough to trigger the miscomputation.
>
> Thanks, I will look into it.
>
> > The changes in the test suite are quite messy and I hope I did not miss any test (you should do "diff -N …" for the new tests).
>
> I don't think this is a good idea.  Applying the patch twice will then
> double the test case.
>
>
> > Do you have test showing a speed-up?
>
> It' in the PR.
>
>
> > I agree with Richard that this patch should be held until the next stage 1.
>
> I just realized that we are, in principle, in regression-only mode.
>
> I do wish the announcements were also made to the fortran mailing lists.

They are made to the low-traffic gcc@ list which everybody working on GCC
is supposed to subscribe to.

> Anyway, I'll see if I can fix that bug, then attach the combined
> patch to the PR for later inclusion.
>
> Regards
>
>         Thomas

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [patch, fortran] Move some array packing to front end
  2019-01-23 15:22 Dominique d'Humières
  2019-01-23 17:27 ` Thomas Koenig
@ 2019-01-23 21:41 ` Thomas Koenig
  1 sibling, 0 replies; 7+ messages in thread
From: Thomas Koenig @ 2019-01-23 21:41 UTC (permalink / raw)
  To: Dominique d'Humières; +Cc: gfortran, gcc-patches

[-- Attachment #1: Type: text/plain, Size: 185 bytes --]

Hi,

for the record, the attached version of the patch regtests cleanly and
also passes the test that Dominique pointed out.

I will defer this until stage 1 reopens.

Regards

	Thomas

[-- Attachment #2: p7.diff --]
[-- Type: text/x-patch, Size: 13520 bytes --]

Index: fortran/expr.c
===================================================================
--- fortran/expr.c	(Revision 268104)
+++ fortran/expr.c	(Arbeitskopie)
@@ -5582,6 +5582,9 @@ gfc_is_simply_contiguous (gfc_expr *expr, bool str
   gfc_ref *ref, *part_ref = NULL;
   gfc_symbol *sym;
 
+  if (expr->expr_type == EXPR_ARRAY)
+    return true;
+
   if (expr->expr_type == EXPR_FUNCTION)
     {
       if (expr->value.function.esym)
Index: fortran/trans-array.c
===================================================================
--- fortran/trans-array.c	(Revision 268104)
+++ fortran/trans-array.c	(Arbeitskopie)
@@ -7755,6 +7755,23 @@ array_parameter_size (tree desc, gfc_expr *expr, t
 			   *size, fold_convert (gfc_array_index_type, elem));
 }
 
+/* Helper function - return true if the argument is a pointer.  */
+ 
+static bool
+is_pointer (gfc_expr *e)
+{
+  gfc_symbol *sym;
+
+  if (e->expr_type != EXPR_VARIABLE ||  e->symtree == NULL)
+    return false;
+
+  sym = e->symtree->n.sym;
+  if (sym == NULL)
+    return false;
+
+  return sym->attr.pointer || sym->attr.proc_pointer;
+}
+
 /* Convert an array for passing as an actual parameter.  */
 
 void
@@ -8006,6 +8023,19 @@ gfc_conv_array_parameter (gfc_se * se, gfc_expr *
 			 "Creating array temporary at %L", &expr->where);
 	}
 
+      /* When optmizing, we can use gfc_conv_subref_array_arg for
+	 making the packing and unpacking operation visible to the
+	 optimizers.  */
+
+      if (g77 && optimize && !optimize_size && expr->expr_type == EXPR_VARIABLE
+	  && !is_pointer (expr))
+	{
+	  gfc_conv_subref_array_arg (se, expr, g77,
+				     fsym ? fsym->attr.intent : INTENT_INOUT,
+				     false, fsym, proc_name, sym);
+	  return;
+	}
+
       ptr = build_call_expr_loc (input_location,
 			     gfor_fndecl_in_pack, 1, desc);
 
Index: fortran/trans-expr.c
===================================================================
--- fortran/trans-expr.c	(Revision 268104)
+++ fortran/trans-expr.c	(Arbeitskopie)
@@ -4535,8 +4535,10 @@ gfc_apply_interface_mapping (gfc_interface_mapping
    an actual argument derived type array is copied and then returned
    after the function call.  */
 void
-gfc_conv_subref_array_arg (gfc_se * parmse, gfc_expr * expr, int g77,
-			   sym_intent intent, bool formal_ptr)
+gfc_conv_subref_array_arg (gfc_se *se, gfc_expr * expr, int g77,
+			   sym_intent intent, bool formal_ptr,
+			   const gfc_symbol *fsym, const char *proc_name,
+			   gfc_symbol *sym)
 {
   gfc_se lse;
   gfc_se rse;
@@ -4553,7 +4555,37 @@ void
   stmtblock_t body;
   int n;
   int dimen;
+  gfc_se work_se;
+  gfc_se *parmse;
+  bool pass_optional;
 
+  pass_optional = fsym && fsym->attr.optional && sym && sym->attr.optional;
+
+  if (pass_optional)
+    {
+      gfc_init_se (&work_se, NULL);
+      parmse = &work_se;
+    }
+  else
+    parmse = se;
+
+  if (gfc_option.rtcheck & GFC_RTCHECK_ARRAY_TEMPS)
+    {
+      /* We will create a temporary array, so let us warn.  */
+      char * msg;
+
+      if (fsym && proc_name)
+	msg = xasprintf ("An array temporary was created for argument "
+			     "'%s' of procedure '%s'", fsym->name, proc_name);
+      else
+	msg = xasprintf ("An array temporary was created");
+
+      tmp = build_int_cst (logical_type_node, 1);
+      gfc_trans_runtime_check (false, true, tmp, &parmse->pre,
+			       &expr->where, msg);
+      free (msg);
+    }
+
   gfc_init_se (&lse, NULL);
   gfc_init_se (&rse, NULL);
 
@@ -4807,6 +4839,27 @@ class_array_fcn:
   else
     parmse->expr = gfc_build_addr_expr (NULL_TREE, parmse->expr);
 
+  /* Wrap the above in "if (present(x))" if needed.  */
+
+  if (pass_optional)
+    {
+      tree present;
+      tree type;
+      tree parmse_expr;
+      stmtblock_t block;
+
+      type = TREE_TYPE (parmse->expr);
+      gfc_start_block (&block);
+      gfc_add_block_to_block (&block, &parmse->pre);
+      gfc_add_block_to_block (&block, &parmse->post);
+      parmse_expr = gfc_finish_block (&block);
+
+      present = gfc_conv_expr_present (sym);
+      tmp = fold_build3_loc (input_location, COND_EXPR, type, present,
+			     parmse_expr, build_int_cst (type, 0));
+      se->expr = tmp;
+    }
+
   return;
 }
 
Index: fortran/trans.h
===================================================================
--- fortran/trans.h	(Revision 268104)
+++ fortran/trans.h	(Arbeitskopie)
@@ -529,7 +529,10 @@ int gfc_is_intrinsic_libcall (gfc_expr *);
 int gfc_conv_procedure_call (gfc_se *, gfc_symbol *, gfc_actual_arglist *,
 			     gfc_expr *, vec<tree, va_gc> *);
 
-void gfc_conv_subref_array_arg (gfc_se *, gfc_expr *, int, sym_intent, bool);
+void gfc_conv_subref_array_arg (gfc_se *, gfc_expr *, int, sym_intent, bool,
+				const gfc_symbol *fsym = NULL,
+				const char *proc_name = NULL,
+				gfc_symbol *sym = NULL);
 
 /* Generate code for a scalar assignment.  */
 tree gfc_trans_scalar_assign (gfc_se *, gfc_se *, gfc_typespec, bool, bool,
Index: testsuite/gfortran.dg/alloc_comp_auto_array_3.f90
===================================================================
--- testsuite/gfortran.dg/alloc_comp_auto_array_3.f90	(Revision 268104)
+++ testsuite/gfortran.dg/alloc_comp_auto_array_3.f90	(Arbeitskopie)
@@ -1,5 +1,5 @@
 ! { dg-do compile }
-! { dg-options "-fdump-tree-original" }
+! { dg-options "-O0 -fdump-tree-original" }
 !
 ! Test the fix for PR66082. The original problem was with the first
 ! call foo_1d.
Index: testsuite/gfortran.dg/assumed_type_2.f90
===================================================================
--- testsuite/gfortran.dg/assumed_type_2.f90	(Revision 268104)
+++ testsuite/gfortran.dg/assumed_type_2.f90	(Arbeitskopie)
@@ -1,5 +1,5 @@
-! { dg-do run }
-! { dg-options "-fdump-tree-original" }
+! { dg-do compile }
+! { dg-options "-O0 -fdump-tree-original" }
 !
 ! PR fortran/48820
 !
Index: testsuite/gfortran.dg/c_loc_test_22.f90
===================================================================
--- testsuite/gfortran.dg/c_loc_test_22.f90	(Revision 268104)
+++ testsuite/gfortran.dg/c_loc_test_22.f90	(Arbeitskopie)
@@ -1,5 +1,5 @@
 ! { dg-do compile }
-! { dg-options "-fdump-tree-original" }
+! { dg-options "-O0 -fdump-tree-original" }
 !
 ! PR fortran/56907
 !
Index: testsuite/gfortran.dg/contiguous_3.f90
===================================================================
--- testsuite/gfortran.dg/contiguous_3.f90	(Revision 268104)
+++ testsuite/gfortran.dg/contiguous_3.f90	(Arbeitskopie)
@@ -1,5 +1,5 @@
 ! { dg-do compile }
-! { dg-options "-fdump-tree-original" }
+! { dg-options "-O0 -fdump-tree-original" }
 !
 ! PR fortran/40632
 !
Index: testsuite/gfortran.dg/internal_pack_11.f90
===================================================================
--- testsuite/gfortran.dg/internal_pack_11.f90	(Revision 268104)
+++ testsuite/gfortran.dg/internal_pack_11.f90	(Arbeitskopie)
@@ -1,5 +1,5 @@
 ! { dg-do compile }
-! { dg-options "-fdump-tree-original" }
+! { dg-options "-O0 -fdump-tree-original" }
 !
 ! Test the fix for PR43173, where unnecessary calls to internal_pack/unpack
 ! were being produced below. These references are contiguous and so do not
Index: testsuite/gfortran.dg/internal_pack_12.f90
===================================================================
--- testsuite/gfortran.dg/internal_pack_12.f90	(Revision 268104)
+++ testsuite/gfortran.dg/internal_pack_12.f90	(Arbeitskopie)
@@ -1,5 +1,5 @@
 ! { dg-do compile }
-! { dg-options "-fdump-tree-original" }
+! { dg-options "-O0 -fdump-tree-original" }
 !
 ! Test the fix for PR43243, where unnecessary calls to internal_pack/unpack
 ! were being produced below. These references are contiguous and so do not
Index: testsuite/gfortran.dg/internal_pack_16.f90
===================================================================
--- testsuite/gfortran.dg/internal_pack_16.f90	(Revision 268104)
+++ testsuite/gfortran.dg/internal_pack_16.f90	(Arbeitskopie)
@@ -1,5 +1,5 @@
 ! { dg-do compile }
-! { dg-additional-options "-fdump-tree-original" }
+! { dg-additional-options "-O0 -fdump-tree-original" }
 ! PR 59345 - pack/unpack was not needed here.
 SUBROUTINE S1(A)
  REAL :: A(3)
Index: testsuite/gfortran.dg/internal_pack_17.f90
===================================================================
--- testsuite/gfortran.dg/internal_pack_17.f90	(Revision 268104)
+++ testsuite/gfortran.dg/internal_pack_17.f90	(Arbeitskopie)
@@ -1,5 +1,5 @@
 ! { dg-do compile }
-! { dg-additional-options "-fdump-tree-original" }
+! { dg-additional-options "-O0 -fdump-tree-original" }
 ! PR 59345 - pack/unpack was not needed here.
 ! Original test case by Joost VandeVondele 
 SUBROUTINE S1(A)
Index: testsuite/gfortran.dg/internal_pack_18.f90
===================================================================
--- testsuite/gfortran.dg/internal_pack_18.f90	(Revision 268104)
+++ testsuite/gfortran.dg/internal_pack_18.f90	(Arbeitskopie)
@@ -1,5 +1,5 @@
 ! { dg-do compile }
-! { dg-additional-options "-fdump-tree-original" }
+! { dg-additional-options "-O0 -fdump-tree-original" }
 ! PR 57992 - this was packed/unpacked unnecessarily.
 ! Original case by Tobias Burnus.
 subroutine test
Index: testsuite/gfortran.dg/internal_pack_4.f90
===================================================================
--- testsuite/gfortran.dg/internal_pack_4.f90	(Revision 268104)
+++ testsuite/gfortran.dg/internal_pack_4.f90	(Arbeitskopie)
@@ -1,5 +1,4 @@
 ! { dg-do run }
-! { dg-options "-fdump-tree-original" }
 !
 ! PR fortran/36132
 !
@@ -25,6 +24,3 @@ END MODULE M1
 USE M1
 CALL S2()
 END
-
-! { dg-final { scan-tree-dump-times "a != 0B \\? \\\(.*\\\) _gfortran_internal_pack" 1 "original" } }
-! { dg-final { scan-tree-dump-times "if \\(a != 0B &&" 1 "original" } }
Index: testsuite/gfortran.dg/internal_pack_5.f90
===================================================================
--- testsuite/gfortran.dg/internal_pack_5.f90	(Revision 268104)
+++ testsuite/gfortran.dg/internal_pack_5.f90	(Arbeitskopie)
@@ -1,5 +1,5 @@
 ! { dg-do compile }
-! { dg-options "-fdump-tree-original" }
+! { dg-options "-O0 -fdump-tree-original" }
 !
 ! PR fortran/36909
 !
Index: testsuite/gfortran.dg/internal_pack_6.f90
===================================================================
--- testsuite/gfortran.dg/internal_pack_6.f90	(Revision 268104)
+++ testsuite/gfortran.dg/internal_pack_6.f90	(Arbeitskopie)
@@ -1,5 +1,5 @@
-! { dg-do run }
-! { dg-options "-fdump-tree-original" }
+! { dg-do compile }
+! { dg-options "-O0 -fdump-tree-original" }
 !
 ! Test the fix for PR41113 and PR41117, in which unnecessary calls
 ! to internal_pack and internal_unpack were being generated.
Index: testsuite/gfortran.dg/internal_pack_9.f90
===================================================================
--- testsuite/gfortran.dg/internal_pack_9.f90	(Revision 268104)
+++ testsuite/gfortran.dg/internal_pack_9.f90	(Arbeitskopie)
@@ -1,5 +1,5 @@
 ! { dg-do compile }
-! { dg-options "-fdump-tree-original" }
+! { dg-options "-O0 -fdump-tree-original" }
 !
 ! During the discussion of the fix for PR43072, in which unnecessary
 ! calls to internal PACK/UNPACK were being generated, the following,
Index: testsuite/gfortran.dg/missing_optional_dummy_6.f90
===================================================================
--- testsuite/gfortran.dg/missing_optional_dummy_6.f90	(Revision 268104)
+++ testsuite/gfortran.dg/missing_optional_dummy_6.f90	(Arbeitskopie)
@@ -46,14 +46,3 @@ contains
   end subroutine scalar2
 
 end program test
-
-! { dg-final { scan-tree-dump-times "scalar2 \\(slr1" 1 "original" } }
-
-! { dg-final { scan-tree-dump-times "= es1 != 0B" 1 "original" } }
-! { dg-final { scan-tree-dump-times "assumed_shape2 \\(es1" 0 "original" } }
-! { dg-final { scan-tree-dump-times "explicit_shape2 \\(es1" 1 "original" } }
-
-! { dg-final { scan-tree-dump-times "= as1 != 0B" 2 "original" } }
-! { dg-final { scan-tree-dump-times "assumed_shape2 \\(as1" 0 "original" } }
-! { dg-final { scan-tree-dump-times "explicit_shape2 \\(as1" 0 "original" } }
-
Index: testsuite/gfortran.dg/no_arg_check_2.f90
===================================================================
--- testsuite/gfortran.dg/no_arg_check_2.f90	(Revision 268104)
+++ testsuite/gfortran.dg/no_arg_check_2.f90	(Arbeitskopie)
@@ -1,5 +1,5 @@
-! { dg-do run }
-! { dg-options "-fdump-tree-original" }
+! { dg-do compile }
+! { dg-options "-O0 -fdump-tree-original" }
 !
 ! PR fortran/39505
 ! 
Index: testsuite/gfortran.dg/typebound_assignment_5.f03
===================================================================
--- testsuite/gfortran.dg/typebound_assignment_5.f03	(Revision 268104)
+++ testsuite/gfortran.dg/typebound_assignment_5.f03	(Arbeitskopie)
@@ -1,5 +1,5 @@
-! { dg-do run }
-! { dg-options "-fdump-tree-original" }
+! { dg-do compile }
+! { dg-options "-O0 -fdump-tree-original" }
 !
 ! PR fortran/49074
 ! ICE on defined assignment with class arrays.
Index: testsuite/gfortran.dg/typebound_assignment_6.f03
===================================================================
--- testsuite/gfortran.dg/typebound_assignment_6.f03	(Revision 268104)
+++ testsuite/gfortran.dg/typebound_assignment_6.f03	(Arbeitskopie)
@@ -1,5 +1,4 @@
 ! { dg-do run }
-! { dg-options "-fdump-tree-original" }
 !
 ! PR fortran/56136
 ! ICE on defined assignment with class arrays.
@@ -37,6 +36,3 @@
         IF (ANY(A(2::2)%I /= (/ ((50+2*I, I=1,SIZE(A)/4), J=1,2) /))) STOP 3
       END PROGRAM
 
-! { dg-final { scan-tree-dump-times "_gfortran_internal_pack" 1 "original" } }
-! { dg-final { scan-tree-dump-times "_gfortran_internal_unpack" 1 "original" } }
-

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [patch, fortran] Move some array packing to front end
  2019-01-23 15:22 Dominique d'Humières
@ 2019-01-23 17:27 ` Thomas Koenig
  2019-01-24 11:01   ` Richard Biener
  2019-01-23 21:41 ` Thomas Koenig
  1 sibling, 1 reply; 7+ messages in thread
From: Thomas Koenig @ 2019-01-23 17:27 UTC (permalink / raw)
  To: Dominique d'Humières; +Cc: gfortran, gcc-patches

Hi Dominique,

> FAIL: gfortran.dg/internal_pack_4.f90   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  execution test
> 
> with -m32.
> 
> gfc /opt/gcc/work/gcc/testsuite/gfortran.dg/internal_pack_4.f90 -O3 -funroll-loops -ftracer -m32
> 
> is enough to trigger the miscomputation.

Thanks, I will look into it.

> The changes in the test suite are quite messy and I hope I did not miss any test (you should do "diff -N …" for the new tests).

I don't think this is a good idea.  Applying the patch twice will then
double the test case.


> Do you have test showing a speed-up?

It' in the PR.


> I agree with Richard that this patch should be held until the next stage 1.

I just realized that we are, in principle, in regression-only mode.

I do wish the announcements were also made to the fortran mailing lists.

Anyway, I'll see if I can fix that bug, then attach the combined
patch to the PR for later inclusion.

Regards

	Thomas

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [patch, fortran] Move some array packing to front end
@ 2019-01-23 15:22 Dominique d'Humières
  2019-01-23 17:27 ` Thomas Koenig
  2019-01-23 21:41 ` Thomas Koenig
  0 siblings, 2 replies; 7+ messages in thread
From: Dominique d'Humières @ 2019-01-23 15:22 UTC (permalink / raw)
  To: Thomas Koenig; +Cc: gfortran, gcc-patches

Hi Thomas,

With your patch I see

FAIL: gfortran.dg/internal_pack_4.f90   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  execution test

with -m32.

gfc /opt/gcc/work/gcc/testsuite/gfortran.dg/internal_pack_4.f90 -O3 -funroll-loops -ftracer -m32

is enough to trigger the miscomputation.

The changes in the test suite are quite messy and I hope I did not miss any test (you should do "diff -N …" for the new tests).

Do you have test showing a speed-up?

I agree with Richard that this patch should be held until the next stage 1.

Thanks for this work.

Dominique

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2019-02-07  6:31 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-01-22 21:10 [patch, fortran] Move some array packing to front end Thomas Koenig
2019-01-23  8:22 ` Richard Biener
2019-01-23 15:22 Dominique d'Humières
2019-01-23 17:27 ` Thomas Koenig
2019-01-24 11:01   ` Richard Biener
2019-02-07  6:31     ` Chris Elrod
2019-01-23 21:41 ` Thomas Koenig

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).