public inbox for gcc-patches@gcc.gnu.org
From: "Bin.Cheng" <amker.cheng@gmail.com>
To: Jan Hubicka <hubicka@ucw.cz>
Cc: bin.cheng@linux.alibaba.com,
	Richard Guenther <richard.guenther@gmail.com>,
		gcc-patches List <gcc-patches@gcc.gnu.org>
Subject: Re: [PATCH AutoFDO/2]Treat ZERO as common profile probability/count
Date: Fri, 07 Dec 2018 10:00:00 -0000
Message-ID: <CAHFci2_=TFhPXFU48Y=xe1R6vVcv7neBkiC4+8AKMCgyrJb4Gw@mail.gmail.com>
In-Reply-To: <CAHFci2_R0TUyFmOgiKbQEzCyUNO_rhhcJBiohd5e-drDiX34Kw@mail.gmail.com>

On Tue, Dec 4, 2018 at 4:40 PM Bin.Cheng <amker.cheng@gmail.com> wrote:
>
> On Thu, Nov 29, 2018 at 12:20 AM Jan Hubicka <hubicka@ucw.cz> wrote:
> >
> > > On Tue, Nov 20, 2018 at 6:55 PM bin.cheng <bin.cheng@linux.alibaba.com> wrote:
> > > >
> > > > Sender:Jan Hubicka <hubicka@ucw.cz>
> > > > Sent at:2018 Nov 5 (Mon) 22:21
> > > > To:Richard Biener <richard.guenther@gmail.com>
> > > > Cc:bin.cheng <bin.cheng@linux.alibaba.com>; GCC Patches <gcc-patches@gcc.gnu.org>
> > > > Subject:Re: [PATCH AutoFDO/2]Treat ZERO as common profile probability/count
> > > >
> > > > >
> > > > > > On Wed, Oct 31, 2018 at 7:30 AM bin.cheng <bin.cheng@linux.alibaba.com> wrote:
> > > > > > >
> > > > > > > Hi,
> > > > > > > > In the new profile probability/count infra, we have different precision quality categories,
> > > > > > > > and probabilities/counts of different categories are not supposed to be compared or
> > > > > > > > calculated together.  Though in general this is an improvement, it introduces unexpected behavior.
> > > > > > > > Specifically, classes profile_probability and profile_count themselves are implemented
> > > > > > > > by comparing probabilities/counts against profile_count::zero().  Since zero() is of
> > > > > > > > profile_precise category, it always compares different from zeros of other precision
> > > > > > > > categories, including afdo.
> > > > > > >
> > > > > > > > I can see two ways of fixing this: 1) Treat zero as a common probability/count regardless
> > > > > > > > of its category; 2) Provide an "is_zero" method rather than relying on "==" comparison
> > > > > > > > against profile_count::zero().  2) requires lots of code changes so I went with 1)
> > > > > > > > in this patch set.  This patch doesn't handle "always", though it might have the same issue.
> > > > > > >
> > > > > > > This patch also corrects a minor issue where we try to invert an uninitialized value.
> > > > > > >
> > > > > > > > Bootstrapped and tested on x86_64 as a patch set.  Is it OK?
> > > > > >
> > > > > > I'll defer on the emit_store_flag_force change, likewise for the zero
> > > > > > handling in
> > > > > > compares - I don't think zeros of different qualities should compare equal.
> > > > > > Would compares against ::always() not have the very same issue?
> > > > > > Likewise ::even(),
> > > > > > ::likely(), etc.?  Those always get guessed quality.
> > > > > >
> > > > > > The invert change looks OK to me.  The related change to the always() API would
> > > > > > suggest to replace guessed_always() with always (guessed) and also do similar
> > > > > > changes throughout the whole API...
> > > > > >
> > > > > > Honza?
> > > > >
> > > > > > The zeros are really different zeros.  profile_count::zero makes us
> > > > > > drop the basic block into the cold section because we know that it won't be
> > > > > > executed in a normal run of the program (either we have accurate profile
> > > > > > feedback, or we proved that the program is on its way to crash, or the user
> > > > > > annotated a cold section).  Having a guessed zero or auto-fdo zero won't
> > > > > > make us do such aggressive size optimization.
> > > > > > This is important since those zeros relatively commonly happen by
> > > > > > accident, and thus if we dropped all that code into the cold section, the cold
> > > > > > section would be visited relatively often during execution of the program,
> > > > > > which would defeat its purpose.
> > > > >
> > > > > > Most comparisons in profile-count.h which go against profile_count==zero
> > > > > > are really intended to pass only for this "absolute zero". They bypass
> > > > > > the precision adjustments which normally happen when you merge values
> > > > > > of different precision.
> > > > >
> > > > > What kind of unexpected behaviour are you seeing?
> > > > > We already have nonzero_p which is what we use when we want to know that
> > > > > count is non-zero in some sense of precision.
> > > > Hi Honza,
> > > > > Sorry for letting this slip away.  So in case of AutoFDO, due to the nature
> > > > > of sampling, lots of funcs/bbs are annotated with zero profile_count in afdo
> > > > > precision, and we have checks against zero profile_count in precise precision.
> > > > > All these checks end up false and cause issues.  Take the code in
> > > > > update_profiling_info as an example:
> > > >
> > > > update_profiling_info (struct cgraph_node *orig_node,
> > > >                        struct cgraph_node *new_node)
> > > > {
> > > >    struct cgraph_edge *cs;
> > > >    struct caller_statistics stats;
> > > >    profile_count new_sum, orig_sum;
> > > >    profile_count remainder, orig_node_count = orig_node->count;
> > > >
> > > >    if (!(orig_node_count.ipa () > profile_count::zero ()))
> > > >      return;
> > > >    //...
> > > >    for (cs = new_node->callees; cs; cs = cs->next_callee)
> > > >      cs->count = cs->count.apply_scale (new_sum, orig_node_count);
> > > >
> > > > Since we also have below code in profile_count::operator>,
> > > >       if (other == profile_count::zero ())
> > > >         return !(*this == profile_count::zero ());
> > > >
> > > > > If orig_node_count is afdo zero, the above zero check for orig_node_count
> > > > > evaluates to false, so we do not return early; we end up passing a zero
> > > > > denominator to apply_scale and asserting.
> > > >
> > > > > In this updated patch, I restricted changes only to profile_count::operator
> > > > > <, >, <= and >=.  Plus, I think there is a latent typo in operator>= because
> > > > > the current code returns TRUE if '*this' is precise zero and 'other' is precise
> > > > > non-zero.
> > > > @@ -879,7 +879,7 @@ public:
> > > >        if (other == profile_count::zero ())
> > > >         return true;
> > > >        if (*this == profile_count::zero ())
> > > > -       return !(other == profile_count::zero ());
> > > > +       return !other.nonzero_p ();
> >
> > We already have
> >
> > True:
> >  profile_count::zero < any other value
> >  any other value > profile_count::zero
> >  profile_count::zero <= any initialized value
> >  profile_count::zero <= profile_count::zero
> >  any initialized value >= profile_count::zero
> >
> > False:
> >  profile_count::zero > any other value
> >  any other value < profile_count::zero
> >
> > You are right about typo in >=, it should be:
> >
> > Index: profile-count.h
> > ===================================================================
> > --- profile-count.h     (revision 266450)
> > +++ profile-count.h     (working copy)
> > @@ -879,7 +879,7 @@
> >        if (other == profile_count::zero ())
> >         return true;
> >        if (*this == profile_count::zero ())
> > -       return !(other == profile_count::zero ());
> > +       return other == profile_count::zero ();
> >        gcc_checking_assert (compatible_p (other));
> >        return m_val >= other.m_val;
> >      }
> >
> > With your patch we get false for:
> >   profile_count::zero < guessed/auto_fdo/other 0
> >   guessed/auto_fdo/other > profile_count::zero
> >   guessed/auto_fdo/other <= profile_count::zero
> >   profile_count::zero >= profile_count::zero
> >
> > The original idea was to intentionally make profile_count::zero smaller
> > than any other type of initialized value, since it is a stricter hint
> > that the path will not be taken.
> > For example, in bb_reorder, if you end up with a "funny" profile with two
> > exit edges, one having profile_count::zero and the other being zero as a result
> > of (unsuccessful) profile updates, it is still a better idea to pick the
> > profile_count::zero for the taken edge.  With your patch it will end up
> > picking either of the paths.
> >
> > How does the patch help in your situation?
> Hi Honza, thanks very much for elaborating.  The issue in the autofdo
> case is as described in my last message.
> Given update_profiling_info implemented as below:
>
> update_profiling_info (struct cgraph_node *orig_node,
>                        struct cgraph_node *new_node)
> {
>    struct cgraph_edge *cs;
>    struct caller_statistics stats;
>    profile_count new_sum, orig_sum;
>    profile_count remainder, orig_node_count = orig_node->count;
>
>    //*****Operator ">" returns true if orig_node_count == autofdo.zero.
>    if (!(orig_node_count.ipa () > profile_count::zero ()))
>      return;
>    //...
>    for (cs = new_node->callees; cs; cs = cs->next_callee)
>      //*****Results in apply_scale being called with autofdo.zero as the 2nd argument.
>      cs->count = cs->count.apply_scale (new_sum, orig_node_count);
>
> Also apply_scale is implemented as:
>   profile_count apply_scale (profile_count num, profile_count den) const
>     {
>       if (*this == profile_count::zero ())
>         return *this;
>       if (num == profile_count::zero ())
>         return num;
>       if (!initialized_p () || !num.initialized_p () || !den.initialized_p ())
>         return profile_count::uninitialized ();
>       if (num == den)
>         return *this;
>       gcc_checking_assert (den.m_val);
>
> Here we have (num != zero && den == autofdo.zero), which triggers the
> gcc_checking_assert.
> According to your explanation, I guess we need to call force_nonzero for
> orig_node_count before calling apply_scale, right?

Hi Honza,
I have committed the typo fix as revision 266885.
Also, I followed your suggestion (IIUC) by calling
profile_count::adjust_for_ipa_scaling for a zero denominator in function
update_profiling_info.  It works and makes more sense than
changing the global zero check logic.
The patch was tested as before; is it OK?
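
For anyone following the thread, the failure mode quoted above can be
reproduced with a toy model.  This is only a sketch: toy_count,
toy_quality and guard_returns_early are made-up names, not GCC's
profile-count.h, and only the zero special case of operator> quoted
above is mirrored.  It assumes equality compares both value and quality.

```cpp
#include <cassert>
#include <cstdint>

// Toy model only: names and layout are illustrative, not GCC's code.
enum class toy_quality { guessed, afdo, precise };

struct toy_count {
  std::uint64_t val;
  toy_quality quality;

  // The "absolute" zero: value 0 with precise quality.
  static toy_count zero() { return {0, toy_quality::precise}; }

  // Equality is quality-sensitive, so an afdo 0 is not the absolute zero.
  bool operator==(const toy_count &o) const {
    return val == o.val && quality == o.quality;
  }

  // Mirrors the quoted special case: compared with the absolute zero,
  // anything that is not itself the absolute zero counts as greater.
  bool operator>(const toy_count &o) const {
    if (*this == zero())
      return false;
    if (o == zero())
      return !(*this == zero());
    assert(quality == o.quality);  // stand-in for compatible_p
    return val > o.val;
  }

  // Value-only test, in the spirit of nonzero_p.
  bool nonzero_p() const { return val != 0; }
};

// The guard from update_profiling_info, reduced to its condition.
inline bool guard_returns_early(const toy_count &orig_node_count) {
  return !(orig_node_count > toy_count::zero());
}
```

With this model, guard_returns_early is true for toy_count::zero() but
false for an afdo count whose value is 0, so the afdo zero reaches the
scaling code as the denominator.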

Thanks,
bin

diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c
index 4471bae11c7..5074ef63da1 100644
--- a/gcc/ipa-cp.c
+++ b/gcc/ipa-cp.c
@@ -3715,9 +3715,11 @@ update_profiling_info (struct cgraph_node *orig_node,
   new_sum = orig_node_count.combine_with_ipa_count (new_sum);
   orig_node->count = remainder;

+  profile_count::adjust_for_ipa_scaling (&new_sum, &orig_node_count);
   for (cs = new_node->callees; cs; cs = cs->next_callee)
     cs->count = cs->count.apply_scale (new_sum, orig_node_count);

+  profile_count::adjust_for_ipa_scaling (&remainder, &orig_node_count);
   for (cs = orig_node->callees; cs; cs = cs->next_callee)
     cs->count = cs->count.apply_scale (remainder, orig_node_count);

2018-12-07  Bin Cheng  <bin.cheng@linux.alibaba.com>

        * ipa-cp.c (update_profiling_info): Call adjust_for_ipa_scaling for
        zero profile count.
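
To illustrate why adjusting the operands before scaling avoids the
assert, here is a second toy sketch.  Again, mini_count,
adjust_for_scaling and scale_after_adjust are hypothetical names, and
the helper's behaviour (force both operands to be nonzero when the
denominator is zero) is my assumption about what
adjust_for_ipa_scaling does, not a copy of it.  The real force_nonzero
may also change the quality, which this model ignores.

```cpp
#include <cassert>
#include <cstdint>

// Toy model only; not GCC's implementation.
enum class mini_quality { afdo, precise };

struct mini_count {
  std::uint64_t val;
  mini_quality quality;

  static mini_count zero() { return {0, mini_quality::precise}; }

  bool operator==(const mini_count &o) const {
    return val == o.val && quality == o.quality;
  }

  // Bump a zero value to 1, keeping the quality (simplifying assumption).
  mini_count force_nonzero() const {
    return val == 0 ? mini_count{1, quality} : *this;
  }

  // Reduced apply_scale: the early exits from the quoted code, then divide.
  mini_count apply_scale(mini_count num, mini_count den) const {
    if (*this == zero()) return *this;
    if (num == zero()) return num;
    if (num == den) return *this;
    assert(den.val != 0);  // stand-in for gcc_checking_assert (den.m_val)
    return {val * num.val / den.val, quality};
  }
};

// Assumed shape of the helper: leave harmless pairs alone, otherwise force
// both operands to be nonzero so the later division is well defined.
inline void adjust_for_scaling(mini_count *num, mini_count *den) {
  if (*num == *den) return;
  if (*num == mini_count::zero()) return;
  if (den->force_nonzero() == *den) return;
  *den = den->force_nonzero();
  *num = num->force_nonzero();
}

// One iteration of the patched loop: adjust first, then scale.
inline mini_count scale_after_adjust(mini_count count, mini_count num,
                                     mini_count den) {
  adjust_for_scaling(&num, &den);
  return count.apply_scale(num, den);
}
```

In this model an afdo zero denominator becomes 1 before apply_scale
runs, so the denominator check no longer fires, while a pair with a
nonzero denominator is scaled exactly as before.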
