Re: [PATCH PR77536]Generate correct profiling information for vectorized loop

From: "Bin.Cheng" <amker.cheng@gmail.com>
To: Jan Hubicka <hubicka@ucw.cz>
Cc: Richard Biener <richard.guenther@gmail.com>,
	Bin Cheng <Bin.Cheng@arm.com>,
		"gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org>,
	nd <nd@arm.com>,
		"pthaugen@linux.vnet.ibm.com" <pthaugen@linux.vnet.ibm.com>
Subject: Re: [PATCH PR77536]Generate correct profiling information for vectorized loop
Date: Mon, 20 Feb 2017 15:16:00 -0000
Message-ID: <CAHFci2_Mq7b_-7cwxdmwvGWojgyeTg2qApedRKTFMgPcJvU-hw@mail.gmail.com>
In-Reply-To: <20170220140210.GA2932@kam.mff.cuni.cz>

On Mon, Feb 20, 2017 at 2:02 PM, Jan Hubicka <hubicka@ucw.cz> wrote:
>> > 2017-02-16  Bin Cheng  <bin.cheng@arm.com>
>> >
>> >         PR tree-optimization/77536
>> >         * tree-ssa-loop-manip.c (niter_for_unrolled_loop): New function.
>> >         (tree_transform_and_unroll_loop): Use above function to compute the
>> >         estimated niter of unrolled loop.
>> >         * tree-ssa-loop-manip.h niter_for_unrolled_loop(): New declaration.
>> >         * tree-vect-loop.c (scale_profile_for_vect_loop): New function.
>> >         (vect_transform_loop): Call above function.
Thanks very much for your suggestions.  I don't know the profiling logic
very well and have some questions embedded below before I start
revising the patch.
>
> +/* Return estimated niter for LOOP after unrolling by FACTOR times.  */
> +
> +unsigned
> +niter_for_unrolled_loop (struct loop *loop, unsigned factor)
> +{
> +  unsigned est_niter = expected_loop_iterations (loop);
>
> What happens when you have a profile and the loop iterates very many times?
> Perhaps we want to do all calculations in gcov_type and use
> expected_loop_iterations_unbounded?
>
> expected_loop_iterations caps the result at 10000, which is easy to exceed.
>
> +  gcc_assert (factor != 0);
> +  unsigned new_est_niter = est_niter / factor;
> +
> +  /* Without profile feedback, loops for which we do not know a better estimate
> +     are assumed to roll 10 times.  When we unroll such loop, it appears to
> +     roll too little, and it may even seem to be cold.  To avoid this, we
> +     ensure that the created loop appears to roll at least 5 times (but at
> +     most as many times as before unrolling).  */
> +  if (new_est_niter < 5)
> +    {
> +      if (est_niter < 5)
> +       new_est_niter = est_niter;
> +      else
> +       new_est_niter = 5;
> +    }
> +
> +  return new_est_niter;
> +}
>
> I see this code is pre-existing, but please extend it to test whether
> loop->header->count is non-zero.  Even if we have no idea about the loop
> iteration count estimate, we may end up predicting more than 10 iterations
> when predictors combine that way.
If I use expected_loop_iterations_unbounded, do I still need to handle
loop->header->count explicitly here?  I suppose not, because it already
contains the code below:

  /* If we have no profile at all, use AVG_LOOP_NITER.  */
  if (profile_status_for_fn (cfun) == PROFILE_ABSENT)
    expected = PARAM_VALUE (PARAM_AVG_LOOP_NITER);
  else if (loop->latch && (loop->latch->count || loop->header->count))
    {
      gcov_type count_in, count_latch;
      //...

The second question is: it looks like it only takes latch->count into
consideration when PROFILE_ABSENT.  But according to your comments, we
could sometimes have a nonzero count?
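
To make the first question concrete, here is a minimal standalone sketch of
the estimate as I read it from the quoted hunk, redone in a 64-bit type as
suggested.  It is only a model, not the patch: the name
model_niter_for_unrolled_loop is made up, est_niter stands in for whatever
expected_loop_iterations_unbounded would return, and int64_t stands in for
gcov_type.

```c
#include <stdint.h>

/* Hypothetical model of the estimate discussed above; not the GCC function.
   EST_NITER is the expected iteration count before unrolling, FACTOR is the
   unroll (or vectorization) factor.  */
int64_t
model_niter_for_unrolled_loop (int64_t est_niter, unsigned factor)
{
  /* The patch asserts factor != 0; this model simply returns the input.  */
  if (factor == 0)
    return est_niter;

  int64_t new_est_niter = est_niter / factor;

  /* Keep the unrolled loop looking like it rolls at least 5 times, but
     never more often than it did before unrolling.  */
  if (new_est_niter < 5)
    new_est_niter = est_niter < 5 ? est_niter : 5;

  return new_est_niter;
}
```

With the default guess of 10 iterations and a factor of 4 this gives 5 rather
than 2, and a large profile-derived count is divided without being capped
first.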

>
> Perhaps testing estimated_loop_iterations would also make sense, but that
> could be dealt with incrementally.
>
> +static void
> +scale_profile_for_vect_loop (struct loop *loop, unsigned vf)
> +{
> +  unsigned freq_h = loop->header->frequency;
> +  unsigned freq_e = EDGE_FREQUENCY (loop_preheader_edge (loop));
> +  /* Reduce loop iterations by the vectorization factor.  */
> +  unsigned new_est_niter = niter_for_unrolled_loop (loop, vf);
> +
> +  if (freq_h != 0)
> +    scale_loop_frequencies (loop, freq_e * (new_est_niter + 1), freq_h);
> +
> I always try to avoid propagating small mistakes (i.e. a wrong freq_h or
> freq_e) into bigger mistakes (i.e. a wrong profile of the whole loop), so
> that mistakes do not spread across the CFG.
>
> But I guess here it is sort of safe because vectorized loops are simple.
> You can't just scale down the existing counts/frequencies by vf, because the
> entry edge frequency was adjusted.
I don't 100% follow here.  It looks like the code avoids changing the
frequency counters for the preheader/exit edges; otherwise we would need to
change all counters dominated by them?
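
For reference, a small standalone model of the scaling arithmetic in the
quoted hunk, as I understand it.  The names model_header_freq and
model_scale_freq are made up for illustration, and the model uses truncating
division, so the rounding may differ from what scale_loop_frequencies
actually does.

```c
#include <stdint.h>

/* Hypothetical model: the frequency the loop header should have when the
   loop is entered with frequency FREQ_E and is expected to iterate
   NEW_EST_NITER times (the header runs once more than the latch).  */
int64_t
model_header_freq (int64_t freq_e, int64_t new_est_niter)
{
  return freq_e * (new_est_niter + 1);
}

/* Hypothetical model of scaling one block frequency by NUM / DEN, applied
   to every block inside the loop.  Blocks outside the loop, including the
   preheader, are left alone.  */
int64_t
model_scale_freq (int64_t freq, int64_t num, int64_t den)
{
  /* The hunk skips scaling when freq_h is 0; the model does the same.  */
  return den == 0 ? freq : freq * num / den;
}
```

For example, with freq_e = 100, freq_h = 1100 and new_est_niter = 5, the
header goes from 1100 to 600 and a body block at 1000 goes to 545, while the
entry frequency of 100 is untouched.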
>
> Also niter_for_unrolled_loop depends on the sanity of the profile, so perhaps
> you need to compute it before you start changing the CFG by peeling the prologue?
Peeling for the prologue doesn't change the profiling information of
vect_loop; it is the skip edge from before the loop to the preferred epilogue
loop that will change the profile counters.  I guess there is a dilemma here
in that niter_for_unrolled_loop is for the loop after peeling for the prologue?

Thanks,
bin
>
> Finally, if freq_e is 0, all frequencies and counts will probably be dropped
> to 0.  What about determining the fraction by counts if they are available?
>
> Otherwise the patch looks good and thanks a lot for working on this!
>
> Honza
>
>> >
>> > gcc/testsuite/ChangeLog
>> > 2017-02-16  Bin Cheng  <bin.cheng@arm.com>
>> >
>> >         PR tree-optimization/77536
>> >         * gcc.dg/vect/pr79347.c: Revise testing string.
