From mboxrd@z Thu Jan 1 00:00:00 1970
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Sender: gcc-patches-owner@gcc.gnu.org
Message-ID: <55A3D170.6080304@mentor.com>
Date: Mon, 13 Jul 2015 14:56:00 -0000
From: Tom de Vries
MIME-Version: 1.0
To: "gcc-patches@gnu.org"
Subject: [PATCH,
 PR66851] Handle double reduction in parloops
Content-Type: multipart/mixed; boundary="------------060802030606020509020504"
X-SW-Source: 2015-07/txt/msg01057.txt.bz2

--------------060802030606020509020504
Content-Type: text/plain; charset="utf-8"; format=flowed
Content-Transfer-Encoding: 7bit
Content-length: 447

Hi,

this patch fixes PR66851.

In parloops, we manage to parallelize outer loops, but not if the inner
loop contains a reduction. There is an xfail in autopar/outer-4.c for this:
...
/* { dg-final { scan-tree-dump-times "parallelizing outer loop" 1
"parloops" { xfail *-*-* } } } */
...

This patch allows outer loops with a reduction in the inner loop to be
parallelized.

Bootstrapped and reg-tested on x86_64.

OK for trunk?

Thanks,
- Tom

--------------060802030606020509020504
Content-Type: text/x-patch; name="0001-Handle-double-reduction-in-parloops.patch"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline; filename="0001-Handle-double-reduction-in-parloops.patch"
Content-length: 7050

Handle double reduction in parloops

2015-07-13  Tom de Vries

	PR tree-optimization/66851
	* tree-parloops.c (reduc_stmt_res): New function.
	(initialize_reductions, add_field_for_reduction)
	(create_phi_for_local_result, create_loads_for_reductions)
	(create_stores_for_reduction, build_new_reduction): Handle case that
	reduc_stmt is a phi.
	(gather_scalar_reductions): Allow double_reduc reductions.

	* gcc.dg/autopar/outer-4.c (parloop): Remove superfluous noinline
	attribute.  Remove xfail on scan for parallelizing outer loop.
	(main): Remove.

	* testsuite/libgomp.c/outer-4.c: New test.
---
 gcc/testsuite/gcc.dg/autopar/outer-4.c | 17 ++++------------
 gcc/tree-parloops.c                    | 37 +++++++++++++++++++++++++---------
 libgomp/testsuite/libgomp.c/outer-4.c  | 36 +++++++++++++++++++++++++++++++++
 3 files changed, 68 insertions(+), 22 deletions(-)
 create mode 100644 libgomp/testsuite/libgomp.c/outer-4.c

diff --git a/gcc/testsuite/gcc.dg/autopar/outer-4.c b/gcc/testsuite/gcc.dg/autopar/outer-4.c
index 6fd37c5..f435080 100644
--- a/gcc/testsuite/gcc.dg/autopar/outer-4.c
+++ b/gcc/testsuite/gcc.dg/autopar/outer-4.c
@@ -6,15 +6,13 @@ void abort (void);
 int g_sum=0;
 int x[500][500];
 
-__attribute__((noinline))
-void parloop (int N)
+void
+parloop (int N)
 {
   int i, j;
   int sum;
 
-  /* Double reduction is currently not supported, outer loop is not
-     parallelized.  Inner reduction is detected, inner loop is
-     parallelized.  */
+  /* Double reduction is detected, outer loop is parallelized.  */
   sum = 0;
   for (i = 0; i < N; i++)
     for (j = 0; j < N; j++)
@@ -23,13 +21,6 @@ void parloop (int N)
   g_sum = sum;
 }
 
-int main(void)
-{
-  parloop(500);
-
-  return 0;
-}
-
-/* { dg-final { scan-tree-dump-times "parallelizing outer loop" 1 "parloops" { xfail *-*-* } } } */
+/* { dg-final { scan-tree-dump-times "parallelizing outer loop" 1 "parloops" } } */
 /* { dg-final { scan-tree-dump-times "loopfn" 4 "optimized" } } */
 
diff --git a/gcc/tree-parloops.c b/gcc/tree-parloops.c
index 21ed17b..db7da62 100644
--- a/gcc/tree-parloops.c
+++ b/gcc/tree-parloops.c
@@ -560,6 +560,14 @@ take_address_of (tree obj, tree type, edge entry,
   return name;
 }
 
+static tree
+reduc_stmt_res (gimple stmt)
+{
+  return (gimple_code (stmt) == GIMPLE_PHI
+          ? gimple_phi_result (stmt)
+          : gimple_assign_lhs (stmt));
+}
+
 /* Callback for htab_traverse.  Create the initialization statement
    for reduction described in SLOT, and place it at the preheader of
    the loop described in DATA.  */
@@ -586,7 +594,7 @@ initialize_reductions (reduction_info **slot, struct loop *loop)
   c = build_omp_clause (gimple_location (reduc->reduc_stmt),
                         OMP_CLAUSE_REDUCTION);
   OMP_CLAUSE_REDUCTION_CODE (c) = reduc->reduction_code;
-  OMP_CLAUSE_DECL (c) = SSA_NAME_VAR (gimple_assign_lhs (reduc->reduc_stmt));
+  OMP_CLAUSE_DECL (c) = SSA_NAME_VAR (reduc_stmt_res (reduc->reduc_stmt));
 
   init = omp_reduction_init (c, TREE_TYPE (bvar));
   reduc->init = init;
@@ -993,7 +1001,7 @@ add_field_for_reduction (reduction_info **slot, tree type)
 {
 
   struct reduction_info *const red = *slot;
-  tree var = gimple_assign_lhs (red->reduc_stmt);
+  tree var = reduc_stmt_res (red->reduc_stmt);
   tree field = build_decl (gimple_location (red->reduc_stmt), FIELD_DECL,
                            SSA_NAME_IDENTIFIER (var), TREE_TYPE (var));
 
@@ -1053,12 +1061,12 @@ create_phi_for_local_result (reduction_info **slot, struct loop *loop)
     e = EDGE_PRED (store_bb, 1);
   else
     e = EDGE_PRED (store_bb, 0);
-  local_res = copy_ssa_name (gimple_assign_lhs (reduc->reduc_stmt));
+  tree lhs = reduc_stmt_res (reduc->reduc_stmt);
+  local_res = copy_ssa_name (lhs);
   locus = gimple_location (reduc->reduc_stmt);
   new_phi = create_phi_node (local_res, store_bb);
   add_phi_arg (new_phi, reduc->init, e, locus);
-  add_phi_arg (new_phi, gimple_assign_lhs (reduc->reduc_stmt),
-               FALLTHRU_EDGE (loop->latch), locus);
+  add_phi_arg (new_phi, lhs, FALLTHRU_EDGE (loop->latch), locus);
   reduc->new_phi = new_phi;
 
   return 1;
@@ -1151,7 +1159,7 @@ create_loads_for_reductions (reduction_info **slot, struct clsn_data *clsn_data)
   struct reduction_info *const red = *slot;
   gimple stmt;
   gimple_stmt_iterator gsi;
-  tree type = TREE_TYPE (gimple_assign_lhs (red->reduc_stmt));
+  tree type = TREE_TYPE (reduc_stmt_res (red->reduc_stmt));
   tree load_struct;
   tree name;
   tree x;
@@ -1212,7 +1220,7 @@ create_stores_for_reduction (reduction_info **slot, struct clsn_data *clsn_data)
   tree t;
   gimple stmt;
   gimple_stmt_iterator gsi;
-  tree type = TREE_TYPE (gimple_assign_lhs (red->reduc_stmt));
+  tree type = TREE_TYPE (reduc_stmt_res (red->reduc_stmt));
 
   gsi = gsi_last_bb (clsn_data->store_bb);
   t = build3 (COMPONENT_REF, type, clsn_data->store, red->field, NULL_TREE);
@@ -2321,6 +2329,7 @@ build_new_reduction (reduction_info_table_type *reduction_list,
 {
   reduction_info **slot;
   struct reduction_info *new_reduction;
+  enum tree_code reduction_code;
 
   gcc_assert (reduc_stmt);
 
@@ -2332,12 +2341,22 @@ build_new_reduction (reduction_info_table_type *reduction_list,
       fprintf (dump_file, "\n");
     }
 
+  if (gimple_code (reduc_stmt) == GIMPLE_PHI)
+    {
+      tree op1 = PHI_ARG_DEF (reduc_stmt, 0);
+      gimple def1 = SSA_NAME_DEF_STMT (op1);
+      reduction_code = gimple_assign_rhs_code (def1);
+    }
+
+  else
+    reduction_code = gimple_assign_rhs_code (reduc_stmt);
+
   new_reduction = XCNEW (struct reduction_info);
 
   new_reduction->reduc_stmt = reduc_stmt;
   new_reduction->reduc_phi = phi;
   new_reduction->reduc_version = SSA_NAME_VERSION (gimple_phi_result (phi));
-  new_reduction->reduction_code = gimple_assign_rhs_code (reduc_stmt);
+  new_reduction->reduction_code = reduction_code;
   slot = reduction_list->find_slot (new_reduction, INSERT);
   *slot = new_reduction;
 }
@@ -2378,7 +2397,7 @@ gather_scalar_reductions (loop_p loop, reduction_info_table_type *reduction_list
       gimple reduc_stmt
         = vect_force_simple_reduction (simple_loop_info, phi, true,
                                        &double_reduc);
-      if (reduc_stmt && !double_reduc)
+      if (reduc_stmt)
         build_new_reduction (reduction_list, reduc_stmt, phi);
     }
 }
diff --git a/libgomp/testsuite/libgomp.c/outer-4.c b/libgomp/testsuite/libgomp.c/outer-4.c
new file mode 100644
index 0000000..f77f634
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c/outer-4.c
@@ -0,0 +1,36 @@
+/* { dg-do run } */
+/* { dg-additional-options "-ftree-parallelize-loops=2" } */
+
+void abort (void);
+
+int g_sum = 1;
+
+int x[500][500];
+
+void __attribute__((noinline,noclone))
+parloop (int N)
+{
+  int i, j;
+  int sum;
+
+  /* Double reduction is detected, outer loop is parallelized.  */
+  sum = 0;
+  for (i = 0; i < N; i++)
+    for (j = 0; j < N; j++)
+      sum += x[i][j];
+
+  g_sum = sum;
+}
+
+int
+main (void)
+{
+  x[234][432] = 2;
+
+  parloop (500);
+
+  if (g_sum != 2)
+    abort ();
+
+  return 0;
+}
-- 
1.9.1


--------------060802030606020509020504--