public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Jeff Law <law@redhat.com>
To: Ilya Enkovich <enkovich.gnu@gmail.com>, gcc-patches@gcc.gnu.org
Subject: Re: [PATCH, vec-tails 07/10] Support loop epilogue combining
Date: Thu, 16 Jun 2016 16:54:00 -0000	[thread overview]
Message-ID: <18ccae1a-30c3-c23c-e28f-287f9d41eaa0@redhat.com> (raw)
In-Reply-To: <20160519194450.GH40563@msticlxl57.ims.intel.com>

On 05/19/2016 01:44 PM, Ilya Enkovich wrote:
> Hi,
>
> This patch introduces support for loop epilogue combining.  This includes
> support in cost estimation and all required changes required to mask
> vectorized loop.
>
> Thanks,
> Ilya
> --
> gcc/
>
> 2016-05-19  Ilya Enkovich  <ilya.enkovich@intel.com>
>
> 	* dbgcnt.def (vect_tail_combine): New.
> 	* params.def (PARAM_VECT_COST_INCREASE_COMBINE_THRESHOLD): New.
> 	* tree-vect-data-refs.c (vect_get_new_ssa_name): Support vect_mask_var.
> 	* tree-vect-loop-manip.c (slpeel_tree_peel_loop_to_edge): Support
> 	epilogue combined with loop body.
> 	(vect_do_peeling_for_loop_bound): Likewise.
> 	* tree-vect-loop.c Include alias.h and dbgcnt.h.
> 	(vect_estimate_min_profitable_iters): Add ret_min_profitable_combine_niters
> 	arg, compute number of iterations for which loop epilogue combining is
> 	profitable.
> 	(vect_generate_tmps_on_preheader): Support combined apilogue.
> 	(vect_gen_ivs_for_masking): New.
> 	(vect_get_mask_index_for_elems): New.
> 	(vect_get_mask_index_for_type): New.
> 	(vect_gen_loop_masks): New.
> 	(vect_mask_reduction_stmt): New.
> 	(vect_mask_mask_load_store_stmt): New.
> 	(vect_mask_load_store_stmt): New.
> 	(vect_combine_loop_epilogue): New.
> 	(vect_transform_loop): Support combined apilogue.
>
>
> diff --git a/gcc/tree-vect-loop-manip.c b/gcc/tree-vect-loop-manip.c
> index fab5879..b3c0668 100644
> --- a/gcc/tree-vect-loop-manip.c
> +++ b/gcc/tree-vect-loop-manip.c
> @@ -1464,11 +1469,20 @@ slpeel_tree_peel_loop_to_edge (struct loop *loop, struct loop *scalar_loop,
>    bb_between_loops = new_exit_bb;
>    bb_after_second_loop = split_edge (single_exit (second_loop));
>
> -  pre_condition =
> -	fold_build2 (EQ_EXPR, boolean_type_node, *first_niters, niters);
> -  skip_e = slpeel_add_loop_guard (bb_between_loops, pre_condition, NULL,
> -                                  bb_after_second_loop, bb_before_first_loop,
> -				  inverse_probability (second_guard_probability));
> +  if (skip_second_after_first)
> +    /* We can just redirect edge from bb_between_loops to
> +       bb_after_second_loop but we have many code assuming
> +       we have a guard after the first loop.  So just make
> +       always taken condtion.  */
> +    pre_condition = fold_build2 (EQ_EXPR, boolean_type_node, integer_zero_node,
> +				 integer_zero_node);
This isn't ideal, but I don't think it's that big of an issue.

> @@ -1758,8 +1772,10 @@ vect_do_peeling_for_loop_bound (loop_vec_info loop_vinfo,
>    basic_block preheader;
>    int loop_num;
>    int max_iter;
> +  int bound2;
>    tree cond_expr = NULL_TREE;
>    gimple_seq cond_expr_stmt_list = NULL;
> +  bool combine = LOOP_VINFO_COMBINE_EPILOGUE (loop_vinfo);
>
>    if (dump_enabled_p ())
>      dump_printf_loc (MSG_NOTE, vect_location,
> @@ -1769,12 +1785,13 @@ vect_do_peeling_for_loop_bound (loop_vec_info loop_vinfo,
>
>    loop_num  = loop->num;
>
> +  bound2 = combine ? th : LOOP_VINFO_VECT_FACTOR (loop_vinfo);
Can you document what the TH parameter is to the various routines that 
use it in tree-vect-loop-manip.c?  I realize you didn't add it, but it 
would help anyone looking at this code in the future to know it's the 
threshold of iterations for vectorization without having to find it in 
other function comment headers ;-)

That's pre-approved to go in immediately :-)

> @@ -1803,7 +1820,11 @@ vect_do_peeling_for_loop_bound (loop_vec_info loop_vinfo,
>    max_iter = (LOOP_VINFO_PEELING_FOR_GAPS (loop_vinfo)
>  	      ? LOOP_VINFO_VECT_FACTOR (loop_vinfo) * 2
>  	      : LOOP_VINFO_VECT_FACTOR (loop_vinfo)) - 2;
> -  if (check_profitability)
> +  /* When epilogue is combined only profitability
> +     treshold matters.  */
s/treshold/threshold/



>  static void
>  vect_estimate_min_profitable_iters (loop_vec_info loop_vinfo,
>  				    int *ret_min_profitable_niters,
> -				    int *ret_min_profitable_estimate)
> +				    int *ret_min_profitable_estimate,
> +				    int *ret_min_profitable_combine_niters)
I'm torn a bit here.  There's all kinds of things missing/incomplete in 
the function comments throughout the vectorizer.  And in some cases, 
like this one, the parameters are largely self-documenting.  But we've 
also got coding standards that we'd like to adhere to.

I don't think it's fair to require you to fix all these issues in the 
vectorizer (though if you wanted to, I'd fully support those an 
independent cleanups).

Perhaps just document LOOP_VINFO with a generic comment about the ret_* 
parameters for this function rather than a comment for each ret_* 
parameter.  Pre-approved for the trunk independent of the vec-tails work.


> @@ -3728,6 +3784,77 @@ vect_estimate_min_profitable_iters (loop_vec_info loop_vinfo,
>  		     min_profitable_estimate);
>
> +
> +      unsigned combine_treshold
> +	= PARAM_VALUE (PARAM_VECT_COST_INCREASE_COMBINE_THRESHOLD);
> +      /* Calculate profitability combining epilogue with the main loop.
> +	 We have a threshold for inside cost overhead (not applied
> +	 for low trip count loop case):
> +	 MIC * 100 < VIC * CT
> +	 Masked iteration should be better than a scalar prologue:
> +	 MIC + VIC < SIC * epilogue_niters  */
Can you double-check the whitespace formatting here.  Where does the 
"100" come from and should it be a param?


> @@ -6886,6 +7030,485 @@ vect_generate_tmps_on_preheader (loop_vec_info loop_vinfo,
>    return;
>  }
>

> +
> +/* Function vect_gen_loop_masks.
> +
> +   Create masks to mask a loop desvribed by LOOP_VINFO.  Masks
s/desvribed/described/

> +   are created according to LOOP_VINFO_REQUIRED_MASKS and are stored
> +   into MASKS vector.
> +
> +   Index of a mask in a vector is computed according to a number
> +   of masks's elements.  Masks are sorted by number of its elements
> +   in descending order.  Index 0 is used to access a mask with
> +   current_vector_size elements.  Among masks with the same number
> +   of elements the one with lower index is used to mask iterations
> +   with smaller iteration counter.  Note that you may get NULL elements
> +   for masks which are not required.  Use vect_get_mask_index_for_elems
> +   or vect_get_mask_index_for_type to access resulting vector.  */
> +
> +static void
> +vect_gen_loop_masks (loop_vec_info loop_vinfo, vec<tree> *masks)
> +{
> +  struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
> +  edge pe = loop_preheader_edge (loop);
> +  tree niters = LOOP_VINFO_NITERS (loop_vinfo);
> +  unsigned min_mask_elems, max_mask_elems, nmasks;
> +  unsigned iv_elems, cur_mask, prev_mask, cur_mask_elems;
> +  auto_vec<tree> ivs;
> +  tree vectype, mask_type;
> +  tree vec_niters, vec_niters_val, mask;
> +  gimple *stmt;
> +  basic_block bb;
> +  gimple_stmt_iterator gsi = gsi_after_labels (loop->header);
> +  unsigned vec_size;
> +
> +  /* Create required IVs.  */
> +  vect_gen_ivs_for_masking (loop_vinfo, &ivs);
> +  vectype = TREE_TYPE (ivs[0]);
> +
> +  vec_size = tree_to_uhwi (TYPE_SIZE_UNIT (vectype));
> +  iv_elems = TYPE_VECTOR_SUBPARTS (vectype);
> +
> +  /* Get a proper niter to build a vector.  */
> +  if (!is_gimple_val (niters))
> +    {
> +      gimple_seq seq = NULL;
> +      niters = force_gimple_operand (niters, &seq, true, NULL);
> +      gsi_insert_seq_on_edge_immediate (pe, seq);
> +    }
> +  /* We may need a type cast in case niter has a too small type
> +     for generated IVs.  */
Nit.  There should be vertical whitespace after the close brace and the 
comment for the next logical block of code.  Can you do a scan over the 
patchkit looking for other instances where the vertical whitespace is 
needed.

Generally, if you find that a blob of code needs a comment, then the 
comment and blob of code should have that vertical whitespace to 
visually separate it from everything else.



> +/* Function vect_combine_loop_epilogue.
> +
> +   Combine loop epilogue with the main vectorized body.  It requires
> +   masking of memory accesses and reductions.  */
So you mask reductions, loads & stores.  Is there anything else that we 
might potentially need to mask to combine the loop & epilogue via masking?


I don't see anything particularly worrisome here either -- I have a 
slight concern about correctness issues with only masking loads/stores 
and reductions.  But I will defer to your judgment on whether or not 
there's other stuff that we need to mask to combine the epilogue with 
the loop via masking.

Jeff

  parent reply	other threads:[~2016-06-16 16:54 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-05-19 19:46 Ilya Enkovich
2016-06-15 11:44 ` Richard Biener
2016-06-16 15:41   ` Ilya Enkovich
2016-06-16 15:51     ` Jeff Law
2016-06-16 16:03       ` Ilya Enkovich
2016-06-16 16:54 ` Jeff Law [this message]
2016-06-28 13:37   ` Ilya Enkovich
2016-06-28 14:16     ` Ilya Enkovich
2016-07-11 13:39     ` Ilya Enkovich
2016-07-14 22:04     ` Jeff Law
2016-07-20 14:40       ` Ilya Enkovich
2016-07-20 16:24         ` Jeff Law
2016-07-21  9:15           ` Ilya Enkovich
2016-07-21 16:34             ` Jeff Law
2016-07-22 11:36               ` Richard Biener
2016-07-25 18:01                 ` Jeff Law
2016-07-25 18:33                   ` Richard Biener
2016-07-25 21:08                     ` Jeff Law
2016-07-26  9:57                       ` Ilya Enkovich
2016-07-26 11:51                         ` Richard Biener
2016-07-26 13:03                           ` Ilya Enkovich
2016-07-26 13:05                             ` Richard Biener
2016-07-26 15:26                         ` Jeff Law
2016-07-26 15:38                           ` Ilya Enkovich
2016-08-01  9:09                             ` Ilya Enkovich
2016-08-01 16:10                               ` Jeff Law
2016-09-02 14:46                                 ` Yuri Rumyantsev
2016-09-02 16:33                                   ` Bin.Cheng
2016-09-05  7:39                                   ` Richard Biener

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=18ccae1a-30c3-c23c-e28f-287f9d41eaa0@redhat.com \
    --to=law@redhat.com \
    --cc=enkovich.gnu@gmail.com \
    --cc=gcc-patches@gcc.gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).