public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [FORTRAN PATCH] Improved scalarization of constant array constructors
@ 2007-01-08  9:23 Paul Richard Thomas
  2007-01-09  1:01 ` Jack Howarth
  0 siblings, 1 reply; 6+ messages in thread
From: Paul Richard Thomas @ 2007-01-08  9:23 UTC (permalink / raw)
  To: roger; +Cc: fortran, gcc-patches

Roger,

Thank you, as usual, for a nicely prepared and explained patch.

> Ok for mainline?

I regtested it on Cygwin_NT/PIV - OK

Paul

^ permalink raw reply	[flat|nested] 6+ messages in thread
* [FORTRAN PATCH] Improved scalarization of constant array       constructors
@ 2007-01-08  4:15 Roger Sayle
  0 siblings, 0 replies; 6+ messages in thread
From: Roger Sayle @ 2007-01-08  4:15 UTC (permalink / raw)
  To: fortran, gcc-patches

[-- Attachment #1: Type: text/plain, Size: 4373 bytes --]

This patch improves the way that gfortran expands constant array
constructors to GIMPLE.  It's probably best explained via a simple
example; consider the following code:

program v
  integer :: x(4)
  x(:) = (/ 3, 1, 4, 1 /)
end program

Currently, this is expanded by the gfortran front-end as:

MAIN__ ()
{
  int4 x[4];

  _gfortran_set_std (70, 127, 0);
  {
    int4 A.1[4];
    struct array1_int4 atmp.0;

    atmp.0.dtype = 265;
    atmp.0.dim[0].stride = 1;
    atmp.0.dim[0].lbound = 0;
    atmp.0.dim[0].ubound = 3;
    atmp.0.data = (void *) &A.1;
    atmp.0.offset = 0;
    {
      static int4 data.3[4] = {3, 1, 4, 1};

      __builtin_memcpy (&(*(int4[0:] *) atmp.0.data)[0], &data.3, 16);
    }
    {
      int8 S.4;

      S.4 = 0;
      while (1)
        {
          if (S.4 > 3) goto L.1;
          x[NON_LVALUE_EXPR <S.4>] = (*(int4[0:] *)
atmp.0.data)[NON_LVALUE_EXPR <S.4>];
          S.4 = S.4 + 1;
        }
      L.1:;
    }
  }
}

Notice that we create a temporary array A, that requires an array
descriptor, that we then populate via a memcpy call, and then use this in
the scalarization of the assignment.  Hence we require three arrays; "A",
"data" and "x", and two copies.

With the patch below, we now generate the much simpler:

MAIN__ ()
{
  int4 x[4];

  _gfortran_set_std (70, 127, 0);
  {
    static int4 A.0[4] = {3, 1, 4, 1};

    {
      int8 S.1;

      S.1 = 1;
      while (1)
        {
          if (S.1 > 4) goto L.1;
          x[S.1 + -1] = A.0[S.1 + -1];
          S.1 = S.1 + 1;
        }
      L.1:;
    }
  }
}

where the array A, is initialized as a "static const", and then used
directly during the scalarization.  This requires only two arrays and
a single copy.  When the array constructor is large, this can result
in significant time and space savings, for the common case.

The optimization is that when the array constructor consists entirely of
constant elements, that specify it's entire size, it's possible to
pre-initialize "A" if we use gfortran's notion of a non-descriptor array
type.

Some additional comments.

[1] The code is structured so that follow-up patches can re-use this code
to lower EXPR_ARRAY expressions outside of the scalarizer, for example
allowing us to use __builtin_memcpy to move "A" into "x" above.

[2] The patch also contains a small, but unrelated clean-up, to
gfc_trans_constructor, that tidies up the scope/use of the variable
CONST_STRING.  Investigation discovered that we set const_string even
when the assignment was dead, and reorganizing things slightly makes the
role of this variable slightly clearer.

[3] This patch also contains another NON_LVALUE_EXPR tweak, which explains
the elimination/simplificiation of the "NON_LVALUE_EXPR <S.4>"
subexpressions in the example code above.

[4] This change actually allows us to do slightly better on the testcase
gfortran.dg/vect/vect-5.f90.  Previously, we expected to find two
unaligned array accesses, though now, because we scalarize references as
"A" rather than via "atmp.0.data", the vectorizer can tell that they are
suitably aligned, and hence we only find one unaligned array access.


The following patch has been tested against mainline with a full "make
bootstrap", including gfortran, and regression tested with a top-level
"make -k check" with no new failures.

Ok for mainline?

2007-01-07  Roger Sayle  <roger@eyesopen.com>

        * trans-array.c (constant_array_constructor_p): New function to
        determine whether an array constructor consists only of constant
        elements, and if so return its size.
        (gfc_build_constant_array_constructor): Construct a statically
        initialized gfortran array for a given EXPR_ARRAY.
        (gfc_trans_constant_array_constructor): Efficiently scalarize
        a constant array constructor.
        (gfc_trans_array_constructor):  Tidy up use of CONST_STRING.
        Special case scalarization of constant array constructors, all of
        whose elements are specified, using constant_array_constructor_p
        and gfc_trans_constant_array_constructor.
        (gfc_conv_scalarized_array_ref): Check whetger info->offset is zero
        before adding it to index, to avoid creating a NON_LVALUE_EXPR.

        * gfortran.dg/array_constructor_14.f90: New test case.
        * gfortran.dg/vect/vect-5.f90: Update test for improved alignment.

Roger
--

[-- Attachment #2: patchf.txt --]
[-- Type: text/plain, Size: 6661 bytes --]

Index: trans-array.c
===================================================================
*** trans-array.c	(revision 120503)
--- trans-array.c	(working copy)
*************** get_array_ctor_strlen (gfc_constructor *
*** 1463,1468 ****
--- 1463,1581 ----
    return is_const;
  }
  
+ /* Check whether the array constructor C consists entirely of constant
+    elements, and if so returns the number of those elements, otherwise
+    return zero.  Note, an empty or NULL array constructor returns zero.  */
+ 
+ static unsigned HOST_WIDE_INT
+ constant_array_constructor_p (gfc_constructor * c)
+ {
+   unsigned HOST_WIDE_INT nelem = 0;
+ 
+   while (c)
+     {
+       if (c->iterator
+ 	  || c->expr->rank > 0
+ 	  || c->expr->expr_type != EXPR_CONSTANT)
+ 	return 0;
+       c = c->next;
+       nelem++;
+     }
+   return nelem;
+ }
+ 
+ 
+ /* Given EXPR, the constant array constructor specified by an EXPR_ARRAY,
+    and the tree type of it's elements, TYPE, return a static constant
+    variable that is compile-time initialized.  */
+ 
+ static tree
+ gfc_build_constant_array_constructor (gfc_expr * expr, tree type)
+ {
+   tree tmptype, list, init, tmp;
+   HOST_WIDE_INT nelem;
+   gfc_constructor *c;
+   gfc_array_spec as;
+   gfc_se se;
+ 
+ 
+   /* First traverse the constructor list, converting the constants
+      to tree to build an initializer.  */
+   nelem = 0;
+   list = NULL_TREE;
+   c = expr->value.constructor;
+   while (c)
+     {
+       gfc_init_se (&se, NULL);
+       gfc_conv_constant (&se, c->expr);
+       if (c->expr->ts.type == BT_CHARACTER
+ 	  && POINTER_TYPE_P (type))
+ 	se.expr = gfc_build_addr_expr (pchar_type_node, se.expr);
+       list = tree_cons (NULL_TREE, se.expr, list);
+       c = c->next;
+       nelem++;
+     }
+ 
+   /* Next detemine the tree type for the array.  We use the gfortran
+      front-end's gfc_get_nodesc_array_type in order to create a suitable
+      GFC_ARRAY_TYPE_P that may be used by the scalarizer.  */
+ 
+   memset (&as, 0, sizeof (gfc_array_spec));
+ 
+   as.rank = 1;
+   as.type = AS_EXPLICIT;
+   as.lower[0] = gfc_int_expr (0);
+   as.upper[0] = gfc_int_expr (nelem - 1);
+   tmptype = gfc_get_nodesc_array_type (type, &as, 3);
+ 
+   init = build_constructor_from_list (tmptype, nreverse (list));
+ 
+   TREE_CONSTANT (init) = 1;
+   TREE_INVARIANT (init) = 1;
+   TREE_STATIC (init) = 1;
+ 
+   tmp = gfc_create_var (tmptype, "A");
+   TREE_STATIC (tmp) = 1;
+   TREE_CONSTANT (tmp) = 1;
+   TREE_INVARIANT (tmp) = 1;
+   TREE_READONLY (tmp) = 1;
+   DECL_INITIAL (tmp) = init;
+ 
+   return tmp;
+ }
+ 
+ 
+ /* Translate a constant EXPR_ARRAY array constructor for the scalarizer.
+    This mostly initializes the scalarizer state info structure with the
+    appropriate values to directly use the array created by the function
+    gfc_build_constant_array_constructor.  */
+ 
+ static void
+ gfc_trans_constant_array_constructor (gfc_loopinfo * loop,
+ 				      gfc_ss * ss, tree type)
+ {
+   gfc_ss_info *info;
+   tree tmp;
+ 
+   tmp = gfc_build_constant_array_constructor (ss->expr, type);
+ 
+   info = &ss->data.info;
+ 
+   info->descriptor = tmp;
+   info->data = build_fold_addr_expr (tmp);
+   info->offset = fold_build1 (NEGATE_EXPR, gfc_array_index_type,
+ 			      loop->from[0]);
+ 
+   info->delta[0] = gfc_index_zero_node;
+   info->start[0] = gfc_index_zero_node;
+   info->end[0] = gfc_index_zero_node;
+   info->stride[0] = gfc_index_one_node;
+   info->dim[0] = 0;
+ 
+   if (info->dimen > loop->temp_dim)
+     loop->temp_dim = info->dimen;
+ }
+ 
  
  /* Array constructors are handled by constructing a temporary, then using that
     within the scalarization loop.  This is not optimal, but seems by far the
*************** gfc_trans_array_constructor (gfc_loopinf
*** 1476,1482 ****
    tree offsetvar;
    tree desc;
    tree type;
-   bool const_string;
    bool dynamic;
  
    ss->data.info.dimen = loop->dimen;
--- 1589,1594 ----
*************** gfc_trans_array_constructor (gfc_loopinf
*** 1484,1490 ****
    c = ss->expr->value.constructor;
    if (ss->expr->ts.type == BT_CHARACTER)
      {
!       const_string = get_array_ctor_strlen (c, &ss->string_length);
        if (!ss->string_length)
  	gfc_todo_error ("complex character array constructors");
  
--- 1596,1602 ----
    c = ss->expr->value.constructor;
    if (ss->expr->ts.type == BT_CHARACTER)
      {
!       bool const_string = get_array_ctor_strlen (c, &ss->string_length);
        if (!ss->string_length)
  	gfc_todo_error ("complex character array constructors");
  
*************** gfc_trans_array_constructor (gfc_loopinf
*** 1493,1502 ****
  	type = build_pointer_type (type);
      }
    else
!     {
!       const_string = TRUE;
!       type = gfc_typenode_for_spec (&ss->expr->ts);
!     }
  
    /* See if the constructor determines the loop bounds.  */
    dynamic = false;
--- 1605,1611 ----
  	type = build_pointer_type (type);
      }
    else
!     type = gfc_typenode_for_spec (&ss->expr->ts);
  
    /* See if the constructor determines the loop bounds.  */
    dynamic = false;
*************** gfc_trans_array_constructor (gfc_loopinf
*** 1518,1523 ****
--- 1627,1651 ----
        mpz_clear (size);
      }
  
+   /* Special case constant array constructors.  */
+   if (!dynamic
+       && loop->dimen == 1
+       && INTEGER_CST_P (loop->from[0])
+       && INTEGER_CST_P (loop->to[0]))
+     {
+       unsigned HOST_WIDE_INT nelem = constant_array_constructor_p (c);
+       if (nelem > 0)
+ 	{
+ 	  tree diff = fold_build2 (MINUS_EXPR, gfc_array_index_type,
+ 				   loop->to[0], loop->from[0]);
+ 	  if (compare_tree_int (diff, nelem - 1) == 0)
+ 	    {
+ 	      gfc_trans_constant_array_constructor (loop, ss, type);
+ 	      return;
+ 	    }
+ 	}
+     }
+ 
    gfc_trans_create_temp_array (&loop->pre, &loop->post, loop, &ss->data.info,
  			       type, dynamic, true, false, false);
  
*************** gfc_conv_scalarized_array_ref (gfc_se * 
*** 2045,2051 ****
  				       info->stride0);
    /* Add the offset for this dimension to the stored offset for all other
       dimensions.  */
!   index = fold_build2 (PLUS_EXPR, gfc_array_index_type, index, info->offset);
  
    tmp = build_fold_indirect_ref (info->data);
    se->expr = gfc_build_array_ref (tmp, index);
--- 2173,2180 ----
  				       info->stride0);
    /* Add the offset for this dimension to the stored offset for all other
       dimensions.  */
!   if (!integer_zerop (info->offset))
!     index = fold_build2 (PLUS_EXPR, gfc_array_index_type, index, info->offset);
  
    tmp = build_fold_indirect_ref (info->data);
    se->expr = gfc_build_array_ref (tmp, index);

[-- Attachment #3: array_constructor_14.f90 --]
[-- Type: text/plain, Size: 325 bytes --]

! { dg-do compile }
! { dg-options "-O2 -fdump-tree-original" }

subroutine foo(x)
  integer :: x(4)
  x(:) = (/ 3, 1, 4, 1 /)
end subroutine

subroutine bar(x)
  integer :: x(4)
  x = (/ 3, 1, 4, 1 /)
end subroutine

! { dg-final { scan-tree-dump-times "data" 0 "original" } }
! { dg-final { cleanup-tree-dump "original" } }

[-- Attachment #4: testf.txt --]
[-- Type: text/plain, Size: 1289 bytes --]

Index: gfortran.dg/vect/vect-5.f90
===================================================================
*** gfortran.dg/vect/vect-5.f90	(revision 120503)
--- gfortran.dg/vect/vect-5.f90	(working copy)
***************
*** 37,43 ****
  
  ! { dg-final { scan-tree-dump-times "vectorized 2 loops" 1 "vect"  } }
  ! { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { xfail { vect_no_align } } } }
! ! { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align } } } }
  ! { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 3 "vect" { target { ilp32 && vect_no_align } } } }
  
  ! We also expect to vectorize one loop for lp64 targets that support 
--- 37,43 ----
  
  ! { dg-final { scan-tree-dump-times "vectorized 2 loops" 1 "vect"  } }
  ! { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { xfail { vect_no_align } } } }
! ! { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 "vect" { xfail { vect_no_align } } } }
  ! { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 3 "vect" { target { ilp32 && vect_no_align } } } }
  
  ! We also expect to vectorize one loop for lp64 targets that support 

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2007-01-09 16:42 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2007-01-08  9:23 [FORTRAN PATCH] Improved scalarization of constant array constructors Paul Richard Thomas
2007-01-09  1:01 ` Jack Howarth
2007-01-09  9:44   ` Paul Thomas
2007-01-09 15:11     ` Jack Howarth
2007-01-09 16:42       ` Paul Richard Thomas
  -- strict thread matches above, loose matches on Subject: below --
2007-01-08  4:15 Roger Sayle

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).