public inbox for gcc-patches@gcc.gnu.org
* [PATCH GCC 1/9]Delete useless code in tree-vect-loop-manip.c
From: Bin Cheng @ 2016-09-06 18:50 UTC (permalink / raw)
  To: gcc-patches; +Cc: nd

[-- Attachment #1: Type: text/plain, Size: 2346 bytes --]

Hi,
This is a patch set generating a new control flow graph for the vectorized loop and its peeling loops.  At the moment, the CFG for a vectorized loop is complicated and sub-optimal.  The major issues are:
A) For both the prologue and the vectorized loop, it generates a guard/branch before the loop checking whether the following (prologue/vectorized) loop should be skipped.  It also generates a guard/branch after the loop checking whether the next (vectorized/epilogue) loop should be skipped.
B) Depending on how conditional set is supported by the target, it may generate one additional if-statement (branch) setting the niters for the prologue loop.
C) In the worst case, up to 4 branch instructions need to be executed before the vectorized loop is entered.
D) For loops without enough niters, it checks and executes some (niters_prologue) iterations with the prologue loop, then checks whether the remaining number of iterations (niters - niters_prologue) is enough for vectorization; if not, it skips the vectorized loop and continues with the epilogue loop.  This is bad since the vectorized loop won't be executed at all after all the hassle.
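
To make A)-D) concrete, here is a rough scalar model of the current layout in plain C.  It is only a sketch: VF, struct trace and sum_old_layout are made-up names for illustration, the inner k-loop stands in for one vector operation, and the guards only approximate what the vectorizer emits (the model does not reproduce the exact branch count from C).

```c
#include <assert.h>

#define VF 4  /* assumed vectorization factor, for illustration only */

/* Counts how many iterations each of the three loops executed.  */
struct trace { int prologue, vector, epilogue; };

/* Hypothetical model of the existing layout: guard, prologue loop,
   guard, vectorized loop, guard, epilogue loop.  PEEL is the number
   of iterations the prologue would like to peel.  */
static int
sum_old_layout (const int *a, int niters, int peel, struct trace *t)
{
  assert (niters >= 0 && peel >= 0);
  int i = 0, sum = 0;
  t->prologue = t->vector = t->epilogue = 0;

  /* B) niters for the prologue loop; without conditional set on the
     target this clamp may itself cost a branch.  */
  int niters_prologue = peel < niters ? peel : niters;

  /* A) guard before the prologue loop.  */
  if (niters_prologue > 0)
    {
      for (; i < niters_prologue; i++, t->prologue++)
        sum += a[i];
    }

  /* A)/D) guard before the vectorized loop, evaluated only after the
     prologue iterations have already been executed.  */
  if (niters - i >= VF)
    {
      for (; i + VF <= niters; i += VF, t->vector++)
        for (int k = 0; k < VF; k++)
          sum += a[i + k];
    }

  /* A) guard before the epilogue loop.  */
  if (i < niters)
    {
      for (; i < niters; i++, t->epilogue++)
        sum += a[i];
    }

  return sum;
}
```

With niters = 5 and peel = 3, this model runs 3 prologue iterations, finds only 2 iterations left, skips the vectorized loop and finishes in the epilogue, which is the situation described in D).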

This patch set improves it by merging the different checks so that only 2 branch instructions (which could be further reduced in combination with loop versioning) are executed before the vectorized loop; it does better compile-time analysis in order to avoid prologue/epilogue peeling where possible; and it improves code generation in various ways (live overflow handling, generating short live ranges).  In terms of implementation, it tries to factor the SSA updating code out of the CFG changing code; I think this may help future work replacing slpeel_* with a generic GIMPLE loop copier.
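
One possible reading of the merged check, again as a scalar C model rather than the actual CFG (which is generated by patch 6 and is not part of this message).  VF, struct trace and sum_merged_check are hypothetical names, and the exact condition used here (niters >= peel + VF) is an assumption made for the sketch.

```c
#include <assert.h>

#define VF 4  /* assumed vectorization factor, for illustration only */

/* Counts how many iterations each of the three loops executed.  */
struct trace { int prologue, vector, epilogue; };

/* Hypothetical model of a merged check: decide once, up front, whether
   there are enough iterations for the prologue plus at least one vector
   iteration.  If not, neither the prologue nor the vectorized loop runs
   and every iteration goes through the scalar epilogue.  */
static int
sum_merged_check (const int *a, int niters, int peel, struct trace *t)
{
  assert (niters >= 0 && peel >= 0 && peel < VF);
  int i = 0, sum = 0;
  t->prologue = t->vector = t->epilogue = 0;

  /* Branch 1: the merged niters check.  */
  if (niters >= peel + VF)
    {
      /* Branch 2: skip the prologue when nothing needs peeling.  */
      if (peel > 0)
        {
          for (; i < peel; i++, t->prologue++)
            sum += a[i];
        }

      /* The check above guarantees at least one vector iteration, so
         the vectorized loop needs no guard of its own.  */
      do
        {
          for (int k = 0; k < VF; k++)
            sum += a[i + k];
          i += VF;
          t->vector++;
        }
      while (i + VF <= niters);
    }

  /* Epilogue: the remaining (or, if the check failed, all) iterations.  */
  for (; i < niters; i++, t->epilogue++)
    sum += a[i];

  return sum;
}
```

In this model, niters = 5 with peel = 3 fails the single check and runs all 5 iterations in the epilogue without touching the prologue, while niters = 12 with peel = 3 reaches the vectorized loop after two branches.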

So far there are 9 patches in the set; patches [1-5] are small prerequisites for the major change, which is done by patch 6.  Patches [7-9] are small patches that either address test cases or improve code generation.  Final bootstrap and test of the patch set are ongoing on x86_64 and AArch64.  Assuming there are no new failures, or that any will be fixed, any comments on this?

This is the first patch, deleting useless code in tree-vect-loop-manip.c as well as fixing an obvious code style issue.

Thanks,
bin

2016-09-01  Bin Cheng  <bin.cheng@arm.com>

	* tree-vect-loop-manip.c (slpeel_can_duplicate_loop_p): Fix code
	style issue.
	(vect_do_peeling_for_loop_bound, vect_do_peeling_for_alignment):
	Remove useless code.

[-- Attachment #2: 001-useless-code-and-format-20160901.txt --]
[-- Type: text/plain, Size: 2029 bytes --]

diff --git a/gcc/tree-vect-loop-manip.c b/gcc/tree-vect-loop-manip.c
index 01d6bb1..3a3b0bc 100644
--- a/gcc/tree-vect-loop-manip.c
+++ b/gcc/tree-vect-loop-manip.c
@@ -1003,9 +1003,9 @@ slpeel_can_duplicate_loop_p (const struct loop *loop, const_edge e)
   gimple_stmt_iterator loop_exit_gsi = gsi_last_bb (exit_e->src);
   unsigned int num_bb = loop->inner? 5 : 2;
 
-      /* All loops have an outer scope; the only case loop->outer is NULL is for
-         the function itself.  */
-      if (!loop_outer (loop)
+  /* All loops have an outer scope; the only case loop->outer is NULL is for
+     the function itself.  */
+  if (!loop_outer (loop)
       || loop->num_nodes != num_bb
       || !empty_block_p (loop->latch)
       || !single_exit (loop)
@@ -1786,7 +1786,6 @@ vect_do_peeling_for_loop_bound (loop_vec_info loop_vinfo,
   struct loop *new_loop;
   edge update_e;
   basic_block preheader;
-  int loop_num;
   int max_iter;
   tree cond_expr = NULL_TREE;
   gimple_seq cond_expr_stmt_list = NULL;
@@ -1797,8 +1796,6 @@ vect_do_peeling_for_loop_bound (loop_vec_info loop_vinfo,
 
   initialize_original_copy_tables ();
 
-  loop_num  = loop->num;
-
   new_loop
     = slpeel_tree_peel_loop_to_edge (loop, scalar_loop, single_exit (loop),
 				     &ratio_mult_vf_name, ni_name, false,
@@ -1806,7 +1803,6 @@ vect_do_peeling_for_loop_bound (loop_vec_info loop_vinfo,
 				     cond_expr, cond_expr_stmt_list,
 				     0, LOOP_VINFO_VECT_FACTOR (loop_vinfo));
   gcc_assert (new_loop);
-  gcc_assert (loop_num == loop->num);
   slpeel_checking_verify_cfg_after_peeling (loop, new_loop);
 
   /* A guard that controls whether the new_loop is to be executed or skipped
@@ -2053,8 +2049,6 @@ vect_do_peeling_for_alignment (loop_vec_info loop_vinfo, tree ni_name,
 
   initialize_original_copy_tables ();
 
-  gimple_seq stmts = NULL;
-  gsi_insert_seq_on_edge_immediate (loop_preheader_edge (loop), stmts);
   niters_of_prolog_loop = vect_gen_niters_for_prolog_loop (loop_vinfo,
 							   ni_name,
 							   &bound);


* Re: [PATCH GCC 1/9]Delete useless code in tree-vect-loop-manip.c
From: Jeff Law @ 2016-09-07 12:20 UTC (permalink / raw)
  To: Bin Cheng, gcc-patches; +Cc: nd

On 09/06/2016 12:49 PM, Bin Cheng wrote:
> 2016-09-01  Bin Cheng  <bin.cheng@arm.com>
>
> 	* tree-vect-loop-manip.c (slpeel_can_duplicate_loop_p): Fix code
> 	style issue.
> 	(vect_do_peeling_for_loop_bound, vect_do_peeling_for_alignment):
> 	Remove useless code.
Seems obvious to me -- I can't think of any reason why we'd emit a NULL 
sequence to the loop preheader edge.

jeff
