From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 88570 invoked by alias); 25 Jun 2015 07:44:04 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 88558 invoked by uid 89); 25 Jun 2015 07:44:04 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-2.7 required=5.0 tests=AWL,BAYES_00,RP_MATCHES_RCVD,SPF_PASS autolearn=ham version=3.3.2 X-HELO: fencepost.gnu.org Received: from fencepost.gnu.org (HELO fencepost.gnu.org) (208.118.235.10) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES128-SHA encrypted) ESMTPS; Thu, 25 Jun 2015 07:44:02 +0000 Received: from eggs.gnu.org ([2001:4830:134:3::10]:53963) by fencepost.gnu.org with esmtps (TLS1.0:RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1Z81pH-0002wu-QY for gcc-patches@gnu.org; Thu, 25 Jun 2015 03:43:59 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1Z81pC-00054J-9M for gcc-patches@gnu.org; Thu, 25 Jun 2015 03:43:59 -0400 Received: from relay1.mentorg.com ([192.94.38.131]:40398) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1Z81pC-000542-09 for gcc-patches@gnu.org; Thu, 25 Jun 2015 03:43:54 -0400 Received: from nat-ies.mentorg.com ([192.94.31.2] helo=SVR-IES-FEM-01.mgc.mentorg.com) by relay1.mentorg.com with esmtp id 1Z81p9-0002LF-4m from Tom_deVries@mentor.com ; Thu, 25 Jun 2015 00:43:51 -0700 Received: from [127.0.0.1] (137.202.0.76) by SVR-IES-FEM-01.mgc.mentorg.com (137.202.0.104) with Microsoft SMTP Server id 14.3.224.2; Thu, 25 Jun 2015 08:43:49 +0100 Message-ID: <558BB12B.7060108@mentor.com> Date: Thu, 25 Jun 2015 08:15:00 -0000 From: Tom de Vries User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.7.0 MIME-Version: 1.0 To: Richard Biener CC: "gcc-patches@gnu.org" Subject: [PATCH, 2/2][PR66642] Add empty loop exit block in transform_to_exit_first_loop_alt Content-Type: multipart/mixed; boundary="------------020709070100050909020807" X-detected-operating-system: by eggs.gnu.org: Windows NT kernel [generic] [fuzzy] X-Received-From: 192.94.38.131 X-SW-Source: 2015-06/txt/msg01769.txt.bz2 --------------020709070100050909020807 Content-Type: text/plain; charset="utf-8"; format=flowed Content-Transfer-Encoding: 7bit Content-length: 1320 Hi, I ran into a failure with parloops for reduction loop testcase libgomp/testsuite/libgomp.c/parloops-exit-first-loop-alt-3.c. When we exercise the low iteration count loop, the test-case fails. To understand the problem, let's first look at what happens when we use transform_to_exit_first_loop (the original one) instead of transform_to_exit_first_loop_alt (the alternative one, which is currently used, and causing the failure). Before transform_to_exit_first_loop, the low iteration count loop and the main loop share the loop exit block. After transform_to_exit_first_loop, that's not the case anymore, the main loop now has an exit block with a single predecessor. Subsequently, separate_decls_in_region inserts code in the main loop exit block, which is only triggered upon exit of the main loop. However, transform_to_exit_first_loop_alt does not insert such an exit block, and the code inserted by separate_decls_in_region is also active for the low iteration count loop, which results in an incorrect reduction result when the low iteration count loop is used. This patch fixes the problem by making sure transform_to_exit_first_loop_alt adds a new exit block inbetween the main loop header and the old exit block. Bootstrapped and reg-tested on x86_64. OK for trunk? Thanks, - Tom --------------020709070100050909020807 Content-Type: text/x-patch; name="0002-Add-empty-loop-exit-block-in-transform_to_exit_first.patch" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename*0="0002-Add-empty-loop-exit-block-in-transform_to_exit_first.pa"; filename*1="tch" Content-length: 6359 Add empty loop exit block in transform_to_exit_first_loop_alt 2015-06-24 Tom de Vries PR tree-optimization/66642 * tree-parloops.c (transform_to_exit_first_loop_alt): Update function header comment. Rename split_edge variable to edge_at_split. Split exit edge to create new loop exit bb. Insert loop exit phis in new loop exit bb. * testsuite/libgomp.c/parloops-exit-first-loop-alt-3.c (main): Test low iteration count case. * testsuite/libgomp.c/parloops-exit-first-loop-alt.c (init): New function, factor out of ... (main): ... here. Test low iteration count case. --- gcc/tree-parloops.c | 45 ++++++++++++++++------ .../libgomp.c/parloops-exit-first-loop-alt-3.c | 5 +++ .../libgomp.c/parloops-exit-first-loop-alt.c | 28 +++++++++++++- 3 files changed, 64 insertions(+), 14 deletions(-) diff --git a/gcc/tree-parloops.c b/gcc/tree-parloops.c index df7c351..6c8aaab 100644 --- a/gcc/tree-parloops.c +++ b/gcc/tree-parloops.c @@ -1522,7 +1522,7 @@ replace_uses_in_bb_by (tree name, tree val, basic_block bb) goto : - sum_z = PHI + sum_z = PHI [1] Where is single_pred (bb latch); In the simplest case, that's . @@ -1549,14 +1549,17 @@ replace_uses_in_bb_by (tree name, tree val, basic_block bb) if (ivtmp_c < n + 1) goto ; else - goto ; + goto ; : ivtmp_b = ivtmp_a + 1; goto + : + sum_y = PHI + : - sum_z = PHI + sum_z = PHI In unified diff format: @@ -1593,9 +1596,12 @@ replace_uses_in_bb_by (tree name, tree val, basic_block bb) - goto + goto ++ : ++ sum_y = PHI + : -- sum_z = PHI -+ sum_z = PHI +- sum_z = PHI ++ sum_z = PHI Note: the example does not show any virtual phis, but these are handled more or less as reductions. @@ -1626,7 +1632,7 @@ transform_to_exit_first_loop_alt (struct loop *loop, /* Create the new_header block. */ basic_block new_header = split_block_before_cond_jump (exit->src); - edge split_edge = single_pred_edge (new_header); + edge edge_at_split = single_pred_edge (new_header); /* Redirect entry edge to new_header. */ edge entry = loop_preheader_edge (loop); @@ -1643,9 +1649,9 @@ transform_to_exit_first_loop_alt (struct loop *loop, e = redirect_edge_and_branch (post_cond_edge, header); gcc_assert (e == post_cond_edge); - /* Redirect split_edge to latch. */ - e = redirect_edge_and_branch (split_edge, latch); - gcc_assert (e == split_edge); + /* Redirect edge_at_split to latch. */ + e = redirect_edge_and_branch (edge_at_split, latch); + gcc_assert (e == edge_at_split); /* Set the new loop bound. */ gimple_cond_set_rhs (cond_stmt, bound); @@ -1697,21 +1703,36 @@ transform_to_exit_first_loop_alt (struct loop *loop, /* Set the latch arguments of the new phis to ivtmp/sum_b. */ flush_pending_stmts (post_inc_edge); - /* Register the reduction exit phis. */ + /* Create a new empty exit block, inbetween the new loop header and the old + exit block. The function separate_decls_in_region needs this block to + insert code that is active on loop exit, but not any other path. */ + basic_block new_exit_block = split_edge (exit); + + /* Insert and register the reduction exit phis. */ for (gphi_iterator gsi = gsi_start_phis (exit_block); !gsi_end_p (gsi); gsi_next (&gsi)) { gphi *phi = gsi.phi (); tree res_z = PHI_RESULT (phi); + + /* Now that we have a new exit block, duplicate the phi of the old exit + block in the new exit block to preserve loop-closed ssa. */ + edge succ_new_exit_block = single_succ_edge (new_exit_block); + edge pred_new_exit_block = single_pred_edge (new_exit_block); + tree res_y = copy_ssa_name (res_z, phi); + gphi *nphi = create_phi_node (res_y, new_exit_block); + tree res_c = PHI_ARG_DEF_FROM_EDGE (phi, succ_new_exit_block); + add_phi_arg (nphi, res_c, pred_new_exit_block, UNKNOWN_LOCATION); + add_phi_arg (phi, res_y, succ_new_exit_block, UNKNOWN_LOCATION); + if (virtual_operand_p (res_z)) continue; - tree res_c = PHI_ARG_DEF_FROM_EDGE (phi, exit); gimple reduc_phi = SSA_NAME_DEF_STMT (res_c); struct reduction_info *red = reduction_phi (reduction_list, reduc_phi); if (red != NULL) - red->keep_res = phi; + red->keep_res = nphi; } /* We're going to cancel the loop at the end of gen_parallel_loop, but until diff --git a/libgomp/testsuite/libgomp.c/parloops-exit-first-loop-alt-3.c b/libgomp/testsuite/libgomp.c/parloops-exit-first-loop-alt-3.c index cb5bf9c..bf06e0e 100644 --- a/libgomp/testsuite/libgomp.c/parloops-exit-first-loop-alt-3.c +++ b/libgomp/testsuite/libgomp.c/parloops-exit-first-loop-alt-3.c @@ -36,5 +36,10 @@ main (void) if (res != 11995) abort (); + /* Test low iteration count case. */ + res = f (10); + if (res != 25) + abort (); + return 0; } diff --git a/libgomp/testsuite/libgomp.c/parloops-exit-first-loop-alt.c b/libgomp/testsuite/libgomp.c/parloops-exit-first-loop-alt.c index 1c32ea3..35c47d7 100644 --- a/libgomp/testsuite/libgomp.c/parloops-exit-first-loop-alt.c +++ b/libgomp/testsuite/libgomp.c/parloops-exit-first-loop-alt.c @@ -21,8 +21,8 @@ f (unsigned int n) c[i] = a[i] + b[i]; } -int -main (void) +static void __attribute__((noclone,noinline)) +init (void) { int i, j; @@ -35,6 +35,14 @@ main (void) b[k] = (k * 3) % 7; c[k] = k * 2; } +} + +int +main (void) +{ + int i; + + init (); f (N); @@ -46,5 +54,21 @@ main (void) abort (); } + /* Test low iteration count case. */ + + init (); + + f (10); + + for (i = 0; i < N; i++) + { + unsigned int actual = c[i]; + unsigned int expected = (i < 10 + ? i + ((i * 3) % 7) + : i * 2); + if (actual != expected) + abort (); + } + return 0; } -- 1.9.1 --------------020709070100050909020807--