* [FORTRAN PATCH] Improved scalarization of constant array constructors
@ 2007-01-08 4:15 Roger Sayle
0 siblings, 0 replies; 6+ messages in thread
From: Roger Sayle @ 2007-01-08 4:15 UTC (permalink / raw)
To: fortran, gcc-patches
[-- Attachment #1: Type: text/plain, Size: 4373 bytes --]
This patch improves the way that gfortran expands constant array
constructors to GIMPLE. It's probably best explained via a simple
example; consider the following code:
program v
integer :: x(4)
x(:) = (/ 3, 1, 4, 1 /)
end program
Currently, this is expanded by the gfortran front-end as:
MAIN__ ()
{
int4 x[4];
_gfortran_set_std (70, 127, 0);
{
int4 A.1[4];
struct array1_int4 atmp.0;
atmp.0.dtype = 265;
atmp.0.dim[0].stride = 1;
atmp.0.dim[0].lbound = 0;
atmp.0.dim[0].ubound = 3;
atmp.0.data = (void *) &A.1;
atmp.0.offset = 0;
{
static int4 data.3[4] = {3, 1, 4, 1};
__builtin_memcpy (&(*(int4[0:] *) atmp.0.data)[0], &data.3, 16);
}
{
int8 S.4;
S.4 = 0;
while (1)
{
if (S.4 > 3) goto L.1;
x[NON_LVALUE_EXPR <S.4>] = (*(int4[0:] *)
atmp.0.data)[NON_LVALUE_EXPR <S.4>];
S.4 = S.4 + 1;
}
L.1:;
}
}
}
Notice that we create a temporary array A, that requires an array
descriptor, that we then populate via a memcpy call, and then use this in
the scalarization of the assignment. Hence we require three arrays; "A",
"data" and "x", and two copies.
With the patch below, we now generate the much simpler:
MAIN__ ()
{
int4 x[4];
_gfortran_set_std (70, 127, 0);
{
static int4 A.0[4] = {3, 1, 4, 1};
{
int8 S.1;
S.1 = 1;
while (1)
{
if (S.1 > 4) goto L.1;
x[S.1 + -1] = A.0[S.1 + -1];
S.1 = S.1 + 1;
}
L.1:;
}
}
}
where the array A, is initialized as a "static const", and then used
directly during the scalarization. This requires only two arrays and
a single copy. When the array constructor is large, this can result
in significant time and space savings, for the common case.
The optimization is that when the array constructor consists entirely of
constant elements, that specify it's entire size, it's possible to
pre-initialize "A" if we use gfortran's notion of a non-descriptor array
type.
Some additional comments.
[1] The code is structured so that follow-up patches can re-use this code
to lower EXPR_ARRAY expressions outside of the scalarizer, for example
allowing us to use __builtin_memcpy to move "A" into "x" above.
[2] The patch also contains a small, but unrelated clean-up, to
gfc_trans_constructor, that tidies up the scope/use of the variable
CONST_STRING. Investigation discovered that we set const_string even
when the assignment was dead, and reorganizing things slightly makes the
role of this variable slightly clearer.
[3] This patch also contains another NON_LVALUE_EXPR tweak, which explains
the elimination/simplificiation of the "NON_LVALUE_EXPR <S.4>"
subexpressions in the example code above.
[4] This change actually allows us to do slightly better on the testcase
gfortran.dg/vect/vect-5.f90. Previously, we expected to find two
unaligned array accesses, though now, because we scalarize references as
"A" rather than via "atmp.0.data", the vectorizer can tell that they are
suitably aligned, and hence we only find one unaligned array access.
The following patch has been tested against mainline with a full "make
bootstrap", including gfortran, and regression tested with a top-level
"make -k check" with no new failures.
Ok for mainline?
2007-01-07 Roger Sayle <roger@eyesopen.com>
* trans-array.c (constant_array_constructor_p): New function to
determine whether an array constructor consists only of constant
elements, and if so return its size.
(gfc_build_constant_array_constructor): Construct a statically
initialized gfortran array for a given EXPR_ARRAY.
(gfc_trans_constant_array_constructor): Efficiently scalarize
a constant array constructor.
(gfc_trans_array_constructor): Tidy up use of CONST_STRING.
Special case scalarization of constant array constructors, all of
whose elements are specified, using constant_array_constructor_p
and gfc_trans_constant_array_constructor.
(gfc_conv_scalarized_array_ref): Check whetger info->offset is zero
before adding it to index, to avoid creating a NON_LVALUE_EXPR.
* gfortran.dg/array_constructor_14.f90: New test case.
* gfortran.dg/vect/vect-5.f90: Update test for improved alignment.
Roger
--
[-- Attachment #2: patchf.txt --]
[-- Type: text/plain, Size: 6661 bytes --]
Index: trans-array.c
===================================================================
*** trans-array.c (revision 120503)
--- trans-array.c (working copy)
*************** get_array_ctor_strlen (gfc_constructor *
*** 1463,1468 ****
--- 1463,1581 ----
return is_const;
}
+ /* Check whether the array constructor C consists entirely of constant
+ elements, and if so returns the number of those elements, otherwise
+ return zero. Note, an empty or NULL array constructor returns zero. */
+
+ static unsigned HOST_WIDE_INT
+ constant_array_constructor_p (gfc_constructor * c)
+ {
+ unsigned HOST_WIDE_INT nelem = 0;
+
+ while (c)
+ {
+ if (c->iterator
+ || c->expr->rank > 0
+ || c->expr->expr_type != EXPR_CONSTANT)
+ return 0;
+ c = c->next;
+ nelem++;
+ }
+ return nelem;
+ }
+
+
+ /* Given EXPR, the constant array constructor specified by an EXPR_ARRAY,
+ and the tree type of it's elements, TYPE, return a static constant
+ variable that is compile-time initialized. */
+
+ static tree
+ gfc_build_constant_array_constructor (gfc_expr * expr, tree type)
+ {
+ tree tmptype, list, init, tmp;
+ HOST_WIDE_INT nelem;
+ gfc_constructor *c;
+ gfc_array_spec as;
+ gfc_se se;
+
+
+ /* First traverse the constructor list, converting the constants
+ to tree to build an initializer. */
+ nelem = 0;
+ list = NULL_TREE;
+ c = expr->value.constructor;
+ while (c)
+ {
+ gfc_init_se (&se, NULL);
+ gfc_conv_constant (&se, c->expr);
+ if (c->expr->ts.type == BT_CHARACTER
+ && POINTER_TYPE_P (type))
+ se.expr = gfc_build_addr_expr (pchar_type_node, se.expr);
+ list = tree_cons (NULL_TREE, se.expr, list);
+ c = c->next;
+ nelem++;
+ }
+
+ /* Next detemine the tree type for the array. We use the gfortran
+ front-end's gfc_get_nodesc_array_type in order to create a suitable
+ GFC_ARRAY_TYPE_P that may be used by the scalarizer. */
+
+ memset (&as, 0, sizeof (gfc_array_spec));
+
+ as.rank = 1;
+ as.type = AS_EXPLICIT;
+ as.lower[0] = gfc_int_expr (0);
+ as.upper[0] = gfc_int_expr (nelem - 1);
+ tmptype = gfc_get_nodesc_array_type (type, &as, 3);
+
+ init = build_constructor_from_list (tmptype, nreverse (list));
+
+ TREE_CONSTANT (init) = 1;
+ TREE_INVARIANT (init) = 1;
+ TREE_STATIC (init) = 1;
+
+ tmp = gfc_create_var (tmptype, "A");
+ TREE_STATIC (tmp) = 1;
+ TREE_CONSTANT (tmp) = 1;
+ TREE_INVARIANT (tmp) = 1;
+ TREE_READONLY (tmp) = 1;
+ DECL_INITIAL (tmp) = init;
+
+ return tmp;
+ }
+
+
+ /* Translate a constant EXPR_ARRAY array constructor for the scalarizer.
+ This mostly initializes the scalarizer state info structure with the
+ appropriate values to directly use the array created by the function
+ gfc_build_constant_array_constructor. */
+
+ static void
+ gfc_trans_constant_array_constructor (gfc_loopinfo * loop,
+ gfc_ss * ss, tree type)
+ {
+ gfc_ss_info *info;
+ tree tmp;
+
+ tmp = gfc_build_constant_array_constructor (ss->expr, type);
+
+ info = &ss->data.info;
+
+ info->descriptor = tmp;
+ info->data = build_fold_addr_expr (tmp);
+ info->offset = fold_build1 (NEGATE_EXPR, gfc_array_index_type,
+ loop->from[0]);
+
+ info->delta[0] = gfc_index_zero_node;
+ info->start[0] = gfc_index_zero_node;
+ info->end[0] = gfc_index_zero_node;
+ info->stride[0] = gfc_index_one_node;
+ info->dim[0] = 0;
+
+ if (info->dimen > loop->temp_dim)
+ loop->temp_dim = info->dimen;
+ }
+
/* Array constructors are handled by constructing a temporary, then using that
within the scalarization loop. This is not optimal, but seems by far the
*************** gfc_trans_array_constructor (gfc_loopinf
*** 1476,1482 ****
tree offsetvar;
tree desc;
tree type;
- bool const_string;
bool dynamic;
ss->data.info.dimen = loop->dimen;
--- 1589,1594 ----
*************** gfc_trans_array_constructor (gfc_loopinf
*** 1484,1490 ****
c = ss->expr->value.constructor;
if (ss->expr->ts.type == BT_CHARACTER)
{
! const_string = get_array_ctor_strlen (c, &ss->string_length);
if (!ss->string_length)
gfc_todo_error ("complex character array constructors");
--- 1596,1602 ----
c = ss->expr->value.constructor;
if (ss->expr->ts.type == BT_CHARACTER)
{
! bool const_string = get_array_ctor_strlen (c, &ss->string_length);
if (!ss->string_length)
gfc_todo_error ("complex character array constructors");
*************** gfc_trans_array_constructor (gfc_loopinf
*** 1493,1502 ****
type = build_pointer_type (type);
}
else
! {
! const_string = TRUE;
! type = gfc_typenode_for_spec (&ss->expr->ts);
! }
/* See if the constructor determines the loop bounds. */
dynamic = false;
--- 1605,1611 ----
type = build_pointer_type (type);
}
else
! type = gfc_typenode_for_spec (&ss->expr->ts);
/* See if the constructor determines the loop bounds. */
dynamic = false;
*************** gfc_trans_array_constructor (gfc_loopinf
*** 1518,1523 ****
--- 1627,1651 ----
mpz_clear (size);
}
+ /* Special case constant array constructors. */
+ if (!dynamic
+ && loop->dimen == 1
+ && INTEGER_CST_P (loop->from[0])
+ && INTEGER_CST_P (loop->to[0]))
+ {
+ unsigned HOST_WIDE_INT nelem = constant_array_constructor_p (c);
+ if (nelem > 0)
+ {
+ tree diff = fold_build2 (MINUS_EXPR, gfc_array_index_type,
+ loop->to[0], loop->from[0]);
+ if (compare_tree_int (diff, nelem - 1) == 0)
+ {
+ gfc_trans_constant_array_constructor (loop, ss, type);
+ return;
+ }
+ }
+ }
+
gfc_trans_create_temp_array (&loop->pre, &loop->post, loop, &ss->data.info,
type, dynamic, true, false, false);
*************** gfc_conv_scalarized_array_ref (gfc_se *
*** 2045,2051 ****
info->stride0);
/* Add the offset for this dimension to the stored offset for all other
dimensions. */
! index = fold_build2 (PLUS_EXPR, gfc_array_index_type, index, info->offset);
tmp = build_fold_indirect_ref (info->data);
se->expr = gfc_build_array_ref (tmp, index);
--- 2173,2180 ----
info->stride0);
/* Add the offset for this dimension to the stored offset for all other
dimensions. */
! if (!integer_zerop (info->offset))
! index = fold_build2 (PLUS_EXPR, gfc_array_index_type, index, info->offset);
tmp = build_fold_indirect_ref (info->data);
se->expr = gfc_build_array_ref (tmp, index);
[-- Attachment #3: array_constructor_14.f90 --]
[-- Type: text/plain, Size: 325 bytes --]
! { dg-do compile }
! { dg-options "-O2 -fdump-tree-original" }
subroutine foo(x)
integer :: x(4)
x(:) = (/ 3, 1, 4, 1 /)
end subroutine
subroutine bar(x)
integer :: x(4)
x = (/ 3, 1, 4, 1 /)
end subroutine
! { dg-final { scan-tree-dump-times "data" 0 "original" } }
! { dg-final { cleanup-tree-dump "original" } }
[-- Attachment #4: testf.txt --]
[-- Type: text/plain, Size: 1289 bytes --]
Index: gfortran.dg/vect/vect-5.f90
===================================================================
*** gfortran.dg/vect/vect-5.f90 (revision 120503)
--- gfortran.dg/vect/vect-5.f90 (working copy)
***************
*** 37,43 ****
! { dg-final { scan-tree-dump-times "vectorized 2 loops" 1 "vect" } }
! { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { xfail { vect_no_align } } } }
! ! { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align } } } }
! { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 3 "vect" { target { ilp32 && vect_no_align } } } }
! We also expect to vectorize one loop for lp64 targets that support
--- 37,43 ----
! { dg-final { scan-tree-dump-times "vectorized 2 loops" 1 "vect" } }
! { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { xfail { vect_no_align } } } }
! ! { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 "vect" { xfail { vect_no_align } } } }
! { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 3 "vect" { target { ilp32 && vect_no_align } } } }
! We also expect to vectorize one loop for lp64 targets that support
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [FORTRAN PATCH] Improved scalarization of constant array constructors
2007-01-09 15:11 ` Jack Howarth
@ 2007-01-09 16:42 ` Paul Richard Thomas
0 siblings, 0 replies; 6+ messages in thread
From: Paul Richard Thomas @ 2007-01-09 16:42 UTC (permalink / raw)
To: Jack Howarth; +Cc: Paul Thomas, roger, fortran, gcc-patches
Yes, ...and SPEC2006 too.
HJ Lu does the SPEC2006 and... Richard Guenther, or somebody in the
SUSE outfit, does both.
Paul
On 1/9/07, Jack Howarth <howarth@bromo.msbb.uc.edu> wrote:
> Paul,
> Someone is running an automated test system for gcc trunk
> against the polyhedron benchmark, no? Shouldn't we be able
> to just watch for any changes when this patch goes into
> gcc trunk?
> Jack
>
> On Tue, Jan 09, 2007 at 10:44:11AM +0100, Paul Thomas wrote:
> > Jack,
> > >Paul,
> > > Is this patch well defined enough that it might be safe for gcc 4.2
> > >branch as well? If it provides a significant performance improvement
> > >that might merit the risk.
> > > Jack
> > >
> > >
> > If it makes a noticable difference to Polyhedron/SPEC2006 and does not
> > damage anything on trunk over a ~1 week timescale, yes I would say that
> > it would be worth it. I do not think that there is any risk on this one.
> >
> > Best regards
> >
> > Paul
> >
>
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [FORTRAN PATCH] Improved scalarization of constant array constructors
2007-01-09 9:44 ` Paul Thomas
@ 2007-01-09 15:11 ` Jack Howarth
2007-01-09 16:42 ` Paul Richard Thomas
0 siblings, 1 reply; 6+ messages in thread
From: Jack Howarth @ 2007-01-09 15:11 UTC (permalink / raw)
To: Paul Thomas; +Cc: Paul Richard Thomas, roger, fortran, gcc-patches
Paul,
Someone is running an automated test system for gcc trunk
against the polyhedron benchmark, no? Shouldn't we be able
to just watch for any changes when this patch goes into
gcc trunk?
Jack
On Tue, Jan 09, 2007 at 10:44:11AM +0100, Paul Thomas wrote:
> Jack,
> >Paul,
> > Is this patch well defined enough that it might be safe for gcc 4.2
> >branch as well? If it provides a significant performance improvement
> >that might merit the risk.
> > Jack
> >
> >
> If it makes a noticable difference to Polyhedron/SPEC2006 and does not
> damage anything on trunk over a ~1 week timescale, yes I would say that
> it would be worth it. I do not think that there is any risk on this one.
>
> Best regards
>
> Paul
>
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [FORTRAN PATCH] Improved scalarization of constant array constructors
2007-01-09 1:01 ` Jack Howarth
@ 2007-01-09 9:44 ` Paul Thomas
2007-01-09 15:11 ` Jack Howarth
0 siblings, 1 reply; 6+ messages in thread
From: Paul Thomas @ 2007-01-09 9:44 UTC (permalink / raw)
To: Jack Howarth; +Cc: Paul Richard Thomas, roger, fortran, gcc-patches
Jack,
> Paul,
> Is this patch well defined enough that it might be safe for gcc 4.2
> branch as well? If it provides a significant performance improvement
> that might merit the risk.
> Jack
>
>
If it makes a noticable difference to Polyhedron/SPEC2006 and does not
damage anything on trunk over a ~1 week timescale, yes I would say that
it would be worth it. I do not think that there is any risk on this one.
Best regards
Paul
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [FORTRAN PATCH] Improved scalarization of constant array constructors
2007-01-08 9:23 Paul Richard Thomas
@ 2007-01-09 1:01 ` Jack Howarth
2007-01-09 9:44 ` Paul Thomas
0 siblings, 1 reply; 6+ messages in thread
From: Jack Howarth @ 2007-01-09 1:01 UTC (permalink / raw)
To: Paul Richard Thomas; +Cc: roger, fortran, gcc-patches
Paul,
Is this patch well defined enough that it might be safe for gcc 4.2
branch as well? If it provides a significant performance improvement
that might merit the risk.
Jack
On Mon, Jan 08, 2007 at 10:23:50AM +0100, Paul Richard Thomas wrote:
> Roger,
>
> Thank you, as usual, for a nicely prepared and explained patch.
>
> >Ok for mainline?
>
> I regtested it on Cygwin_NT/PIV - OK
>
> Paul
^ permalink raw reply [flat|nested] 6+ messages in thread
* [FORTRAN PATCH] Improved scalarization of constant array constructors
@ 2007-01-08 9:23 Paul Richard Thomas
2007-01-09 1:01 ` Jack Howarth
0 siblings, 1 reply; 6+ messages in thread
From: Paul Richard Thomas @ 2007-01-08 9:23 UTC (permalink / raw)
To: roger; +Cc: fortran, gcc-patches
Roger,
Thank you, as usual, for a nicely prepared and explained patch.
> Ok for mainline?
I regtested it on Cygwin_NT/PIV - OK
Paul
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2007-01-09 16:42 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2007-01-08 4:15 [FORTRAN PATCH] Improved scalarization of constant array constructors Roger Sayle
2007-01-08 9:23 Paul Richard Thomas
2007-01-09 1:01 ` Jack Howarth
2007-01-09 9:44 ` Paul Thomas
2007-01-09 15:11 ` Jack Howarth
2007-01-09 16:42 ` Paul Richard Thomas
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).