public inbox for gcc-bugs@sourceware.org
From: Changbin Du <changbin.du@huawei.com>
To: Jan Hubicka <hubicka@ucw.cz>
Cc: Richard Biener <richard.guenther@gmail.com>,
	Changbin Du <changbin.du@huawei.com>, <gcc@gcc.gnu.org>,
	<gcc-bugs@gcc.gnu.org>, Ning Jia <ning.jia@huawei.com>,
	Li Yu <marvin.tms@huawei.com>, Wang Nan <wangnan0@huawei.com>,
	Hui Wang <hw.huiwang@huawei.com>
Subject: Re: [Predicated Ins vs Branches] O3 and PGO result in 2x performance drop relative to O2
Date: Tue, 1 Aug 2023 20:31:15 +0800	[thread overview]
Message-ID: <20230801123115.yqh3yl4imgp3xfwh@M910t> (raw)
In-Reply-To: <ZMjF0jYKFMb2DVPP@kam.mff.cuni.cz>

On Tue, Aug 01, 2023 at 10:44:02AM +0200, Jan Hubicka wrote:
> > > If I comment it out as in the above patch, then O3/PGO get a 16% and 12% performance
> > > improvement over O2 on x86.
> > >
> > >                         O2              O3              PGO
> > > cycles                  2,497,674,824   2,104,993,224   2,199,753,593
> > > instructions            10,457,508,646  9,723,056,131   10,457,216,225
> > > branches                2,303,029,380   2,250,522,323   2,302,994,942
> > > branch-misses           0.00%           0.01%           0.01%
> > >
> > > The main difference in the compilation output around the mispredicted
> > > branch is:
> > >   o In O2: a predicated instruction (cmov here) is selected to eliminate the
> > >     above branch. cmov is truly better than a branch here.
> > >   o In O3/PGO: bitout() is inlined into encode_file(), and a branch instruction
> > >     is selected. But this branch is obviously *unpredictable* and the compiler
> > >     doesn't know it. This is why O3/PGO are so bad for this program.
> > >
> > > GCC doesn't support __builtin_unpredictable(), which was introduced by LLVM.
> > > So I tried to see whether __builtin_expect_with_probability(e, x, 0.5) can serve
> > > the same purpose. The result is negative.
> > 
> > But does it appear to be predictable with your profiling data?
> 
> Also one thing is that __builtin_expect and
> __builtin_expect_with_probability only affect the static branch
> prediction algorithm, so with profile feedback they are ignored on every
> branch executed at least once during the train run.
> 
> Setting probability 0.5 is really not exactly the same as a hint that the
> branch will be mispredicted, since modern CPUs handle regularly
> behaving branches well (such as a branch firing on every even iteration of a loop).
>
Yeah. Setting probability 0.5 is just an experimental attempt. I don't know
how the heuristic works internally.
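
To illustrate the distinction, here is a minimal sketch (not the benchmark
code; EVEN_ODDS, sum_alternating and sum_data_dependent are hypothetical names
made up for this example). Both branches are taken about half the time, so a
0.5 probability describes them equally, yet only the first follows a regular
pattern that a branch predictor can typically learn:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper macro: use __builtin_expect_with_probability when the
   compiler reports it, otherwise fall back to the plain condition. */
#if defined(__has_builtin)
#  if __has_builtin(__builtin_expect_with_probability)
#    define EVEN_ODDS(c) __builtin_expect_with_probability(!!(c), 1, 0.5)
#  endif
#endif
#ifndef EVEN_ODDS
#  define EVEN_ODDS(c) (!!(c))
#endif

/* Branch taken on every even iteration: 50% taken, but perfectly regular. */
uint64_t sum_alternating(const uint32_t *v, size_t n)
{
    uint64_t sum = 0;
    for (size_t i = 0; i < n; i++)
        if (EVEN_ODDS((i & 1) == 0))
            sum += v[i];
    return sum;
}

/* Branch taken when the low data bit is set: also about 50% taken for
   random input, but with no pattern for the predictor to learn. */
uint64_t sum_data_dependent(const uint32_t *v, size_t n)
{
    uint64_t sum = 0;
    for (size_t i = 0; i < n; i++)
        if (EVEN_ODDS(v[i] & 1))
            sum += v[i];
    return sum;
}
```

The hint carries the same number in both functions, so it cannot by itself
tell the compiler which of the two branches is the hard one.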

> So I think having the builtin is not a bad idea.  I was thinking about whether it
> makes sense to represent it within the profile_probability type and I am
> not convinced, since "unpredictable probability" sounds conceptually
> odd and we would need to keep the flag intact over all probability
> updates we do.  For things like loop exits we recompute probabilities
> from frequencies after unrolling/vectorization and other things and we
> would need to invent a new API to propagate the flag from the previous
> probability (which is not even part of the computation right now).
> 
> So I guess the challenge is how to pass this info down through the
> optimization pipeline, since we would need to annotate gimple
> conds/switches and carry it down to the RTL level.  On gimple we have flags
> and on the RTL level notes, so there is space for it, but we would need to
> maintain the info through CFG changes.
> 
> Auto-FDO may be an interesting way to detect such branches.
> 
So I suppose PGO could, too. But in my test PGO selects a branch instruction just
as O3 does, and the data shows that cmov works better than a branch here.
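
For illustration only, here is a sketch of two source-level ways to express
such a choice (this is not bitout(); UNPREDICTABLE, pick_hinted and pick_masked
are hypothetical names). The first uses __builtin_unpredictable() where the
compiler reports it and is a plain branch elsewhere. The second has no branch
in the source at all, although whether the compiler emits cmov for either form
is still its own decision:

```c
#include <stdint.h>

/* Hypothetical helper macro: expands to __builtin_unpredictable() when the
   compiler reports it, and to the plain condition otherwise. */
#if defined(__has_builtin)
#  if __has_builtin(__builtin_unpredictable)
#    define UNPREDICTABLE(c) __builtin_unpredictable(!!(c))
#  endif
#endif
#ifndef UNPREDICTABLE
#  define UNPREDICTABLE(c) (!!(c))
#endif

/* Hinted form: an ordinary branch, annotated where the builtin exists. */
uint32_t pick_hinted(int cond, uint32_t a, uint32_t b)
{
    if (UNPREDICTABLE(cond))
        return a;
    return b;
}

/* Masked form: the selection is done with arithmetic, so the source
   contains no conditional branch. */
uint32_t pick_masked(int cond, uint32_t a, uint32_t b)
{
    /* mask is all ones when cond is non-zero, all zeros otherwise */
    uint32_t mask = (uint32_t)0 - (uint32_t)(cond != 0);
    return (a & mask) | (b & ~mask);
}
```

The masked form trades readability for not depending on the if-conversion
heuristics, which may or may not be worth it in a given hot loop.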

> Honza
> > 
> > > I think we can conclude that there is room for improvement in
> > > GCC's heuristics for choosing between predicated instructions and branches, at least
> > > for O3 and PGO.
> > >
> > > And can we add __builtin_unpredictable() support to GCC? It is usually hard
> > > for the compiler to detect unpredictable branches.
> > >
> > > --
> > > Cheers,
> > > Changbin Du

-- 
Cheers,
Changbin Du
