* Fix optimize_mask_stores profile update
@ 2023-07-17 10:35 Jan Hubicka
2023-07-17 10:49 ` Richard Biener
0 siblings, 1 reply; 6+ messages in thread
From: Jan Hubicka @ 2023-07-17 10:35 UTC (permalink / raw)
To: gcc-patches, rguenther
Hi,
While looking into sphinx3 regression I noticed that vectorizer produces
BBs with overall probability count 120%. This patch fixes it.
Richi, I don't know how to create a testcase, but having one would
be nice.
Bootstrapped/regtested x86_64-linux, commited last night (sorry for
late email)
gcc/ChangeLog:
PR tree-optimization/110649
* tree-vect-loop.cc (optimize_mask_stores): Set correctly
probability of the if-then-else construct.
diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
index 7d917bfd72c..b44fb9c7712 100644
--- a/gcc/tree-vect-loop.cc
+++ b/gcc/tree-vect-loop.cc
@@ -11680,6 +11679,7 @@ optimize_mask_stores (class loop *loop)
efalse = make_edge (bb, store_bb, EDGE_FALSE_VALUE);
/* Put STORE_BB to likely part. */
efalse->probability = profile_probability::unlikely ();
+ e->probability = efalse->probability.invert ();
store_bb->count = efalse->count ();
make_single_succ_edge (store_bb, join_bb, EDGE_FALLTHRU);
if (dom_info_available_p (CDI_DOMINATORS))
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: Fix optimize_mask_stores profile update
2023-07-17 10:35 Fix optimize_mask_stores profile update Jan Hubicka
@ 2023-07-17 10:49 ` Richard Biener
2023-07-17 12:38 ` Jan Hubicka
2023-07-21 17:34 ` Jan Hubicka
0 siblings, 2 replies; 6+ messages in thread
From: Richard Biener @ 2023-07-17 10:49 UTC (permalink / raw)
To: Jan Hubicka; +Cc: gcc-patches, rguenther
On Mon, Jul 17, 2023 at 12:36 PM Jan Hubicka via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> Hi,
> While looking into sphinx3 regression I noticed that vectorizer produces
> BBs with overall probability count 120%. This patch fixes it.
> Richi, I don't know how to create a testcase, but having one would
> be nice.
>
> Bootstrapped/regtested x86_64-linux, commited last night (sorry for
> late email)
This should trigger with sth like
for (i)
if (cond[i])
out[i] = 1.;
so a masked store and then using AVX2+. ISTR we disable AVX masked
stores on zen (but not AVX512).
Richard.
> gcc/ChangeLog:
>
> PR tree-optimization/110649
> * tree-vect-loop.cc (optimize_mask_stores): Set correctly
> probability of the if-then-else construct.
>
> diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
> index 7d917bfd72c..b44fb9c7712 100644
> --- a/gcc/tree-vect-loop.cc
> +++ b/gcc/tree-vect-loop.cc
> @@ -11680,6 +11679,7 @@ optimize_mask_stores (class loop *loop)
> efalse = make_edge (bb, store_bb, EDGE_FALSE_VALUE);
> /* Put STORE_BB to likely part. */
> efalse->probability = profile_probability::unlikely ();
> + e->probability = efalse->probability.invert ();
> store_bb->count = efalse->count ();
isn't the count also wrong? Or rather efalse should be likely(). We're
testing doing
if (!mask all zeros)
masked-store
because a masked store with all zero mask can end up invoking COW page fault
handling multiple times (because it doesn't actually write).
Note -Ofast allows store data races and thus does RMW instead of a masked store.
> make_single_succ_edge (store_bb, join_bb, EDGE_FALLTHRU);
> if (dom_info_available_p (CDI_DOMINATORS))
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: Fix optimize_mask_stores profile update
2023-07-17 10:49 ` Richard Biener
@ 2023-07-17 12:38 ` Jan Hubicka
2023-07-17 13:26 ` Richard Biener
2023-07-21 17:34 ` Jan Hubicka
1 sibling, 1 reply; 6+ messages in thread
From: Jan Hubicka @ 2023-07-17 12:38 UTC (permalink / raw)
To: Richard Biener, gcc-patches
> On Mon, Jul 17, 2023 at 12:36 PM Jan Hubicka via Gcc-patches
> <gcc-patches@gcc.gnu.org> wrote:
> >
> > Hi,
> > While looking into sphinx3 regression I noticed that vectorizer produces
> > BBs with overall probability count 120%. This patch fixes it.
> > Richi, I don't know how to create a testcase, but having one would
> > be nice.
> >
> > Bootstrapped/regtested x86_64-linux, commited last night (sorry for
> > late email)
>
> This should trigger with sth like
>
> for (i)
> if (cond[i])
> out[i] = 1.;
>
> so a masked store and then using AVX2+. ISTR we disable AVX masked
> stores on zen (but not AVX512).
OK, let me see if I can get a testcase out of that.
> > efalse = make_edge (bb, store_bb, EDGE_FALSE_VALUE);
> > /* Put STORE_BB to likely part. */
> > efalse->probability = profile_probability::unlikely ();
> > + e->probability = efalse->probability.invert ();
> > store_bb->count = efalse->count ();
>
> isn't the count also wrong? Or rather efalse should be likely(). We're
> testing doing
>
> if (!mask all zeros)
> masked-store
>
> because a masked store with all zero mask can end up invoking COW page fault
> handling multiple times (because it doesn't actually write).
Hmm, I only fixed the profile, efalse was already set to unlikely, but
indeed I think it should be likely. Maybe we can compute some bound on
actual probability by knowing if(cond[i]) probability.
If the loop always does factor many ones or zeros, the probability would
remain the same.
If that is p and they are all independent, the outcome would be
(1-p)^factor
sp we know the conditoinal shoul dbe in ragne (1-p)^factor....(1-p),
right?
Honza
>
> Note -Ofast allows store data races and thus does RMW instead of a masked store.
>
> > make_single_succ_edge (store_bb, join_bb, EDGE_FALLTHRU);
> > if (dom_info_available_p (CDI_DOMINATORS))
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: Fix optimize_mask_stores profile update
2023-07-17 12:38 ` Jan Hubicka
@ 2023-07-17 13:26 ` Richard Biener
0 siblings, 0 replies; 6+ messages in thread
From: Richard Biener @ 2023-07-17 13:26 UTC (permalink / raw)
To: Jan Hubicka; +Cc: gcc-patches
> Am 17.07.2023 um 14:38 schrieb Jan Hubicka <hubicka@ucw.cz>:
>
>
>>
>>> On Mon, Jul 17, 2023 at 12:36 PM Jan Hubicka via Gcc-patches
>>> <gcc-patches@gcc.gnu.org> wrote:
>>>
>>> Hi,
>>> While looking into sphinx3 regression I noticed that vectorizer produces
>>> BBs with overall probability count 120%. This patch fixes it.
>>> Richi, I don't know how to create a testcase, but having one would
>>> be nice.
>>>
>>> Bootstrapped/regtested x86_64-linux, commited last night (sorry for
>>> late email)
>>
>> This should trigger with sth like
>>
>> for (i)
>> if (cond[i])
>> out[i] = 1.;
>>
>> so a masked store and then using AVX2+. ISTR we disable AVX masked
>> stores on zen (but not AVX512).
>
> OK, let me see if I can get a testcase out of that.
>>> efalse = make_edge (bb, store_bb, EDGE_FALSE_VALUE);
>>> /* Put STORE_BB to likely part. */
>>> efalse->probability = profile_probability::unlikely ();
>>> + e->probability = efalse->probability.invert ();
>>> store_bb->count = efalse->count ();
>>
>> isn't the count also wrong? Or rather efalse should be likely(). We're
>> testing doing
>>
>> if (!mask all zeros)
>> masked-store
>>
>> because a masked store with all zero mask can end up invoking COW page fault
>> handling multiple times (because it doesn't actually write).
>
> Hmm, I only fixed the profile, efalse was already set to unlikely, but
> indeed I think it should be likely. Maybe we can compute some bound on
> actual probability by knowing if(cond[i]) probability.
> If the loop always does factor many ones or zeros, the probability would
> remain the same.
> If that is p and they are all independent, the outcome would be
> (1-p)^factor
>
> sp we know the conditoinal shoul dbe in ragne (1-p)^factor....(1-p),
> right?
Yes. I think the heuristic was added for
The case of bigger ranges with all 0/1 for
Purely random one wouldn’t expect all zeros ever in practice. Maybe the probability was also set with that special case in mind (which is of course broken)
Richard
> Honza
>
>>
>> Note -Ofast allows store data races and thus does RMW instead of a masked store.
>>
>>> make_single_succ_edge (store_bb, join_bb, EDGE_FALLTHRU);
>>> if (dom_info_available_p (CDI_DOMINATORS))
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: Fix optimize_mask_stores profile update
2023-07-17 10:49 ` Richard Biener
2023-07-17 12:38 ` Jan Hubicka
@ 2023-07-21 17:34 ` Jan Hubicka
2023-07-24 7:28 ` Richard Biener
1 sibling, 1 reply; 6+ messages in thread
From: Jan Hubicka @ 2023-07-21 17:34 UTC (permalink / raw)
To: Richard Biener; +Cc: gcc-patches, rguenther
> On Mon, Jul 17, 2023 at 12:36 PM Jan Hubicka via Gcc-patches
> <gcc-patches@gcc.gnu.org> wrote:
> >
> > Hi,
> > While looking into sphinx3 regression I noticed that vectorizer produces
> > BBs with overall probability count 120%. This patch fixes it.
> > Richi, I don't know how to create a testcase, but having one would
> > be nice.
> >
> > Bootstrapped/regtested x86_64-linux, commited last night (sorry for
> > late email)
>
> This should trigger with sth like
>
> for (i)
> if (cond[i])
> out[i] = 1.;
>
> so a masked store and then using AVX2+. ISTR we disable AVX masked
> stores on zen (but not AVX512).
Richard,
if we know probability of if (cond[i]) to be p,
then we know that the combined conditional is somewhere between
low = p (the strategy packing true and falses into VF sized
blocks)
and
high = min (p*vf,1)
(the stragegy doing only one true per block if possible)
Likely value is
likely = 1-pow(1-p, vf)
I wonder if we can work out p at least in common cases.
Making store unlikely as we do right now will place it offline with
extra jump. Making it likely is better unless p is very small.
I think if p is close to 0 or 1 which may be common case the analysis
above may be useful. If range [low...high] is small, we can use likely
and keep it as reliable.
If it is high, we can probably just end up with guessed value close but
above 50% so the store stays inline.
Honza
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: Fix optimize_mask_stores profile update
2023-07-21 17:34 ` Jan Hubicka
@ 2023-07-24 7:28 ` Richard Biener
0 siblings, 0 replies; 6+ messages in thread
From: Richard Biener @ 2023-07-24 7:28 UTC (permalink / raw)
To: Jan Hubicka; +Cc: gcc-patches
On Fri, Jul 21, 2023 at 7:34 PM Jan Hubicka <hubicka@ucw.cz> wrote:
>
> > On Mon, Jul 17, 2023 at 12:36 PM Jan Hubicka via Gcc-patches
> > <gcc-patches@gcc.gnu.org> wrote:
> > >
> > > Hi,
> > > While looking into sphinx3 regression I noticed that vectorizer produces
> > > BBs with overall probability count 120%. This patch fixes it.
> > > Richi, I don't know how to create a testcase, but having one would
> > > be nice.
> > >
> > > Bootstrapped/regtested x86_64-linux, commited last night (sorry for
> > > late email)
> >
> > This should trigger with sth like
> >
> > for (i)
> > if (cond[i])
> > out[i] = 1.;
> >
> > so a masked store and then using AVX2+. ISTR we disable AVX masked
> > stores on zen (but not AVX512).
>
> Richard,
> if we know probability of if (cond[i]) to be p,
> then we know that the combined conditional is somewhere between
> low = p (the strategy packing true and falses into VF sized
> blocks)
> and
> high = min (p*vf,1)
> (the stragegy doing only one true per block if possible)
> Likely value is
>
> likely = 1-pow(1-p, vf)
>
> I wonder if we can work out p at least in common cases.
> Making store unlikely as we do right now will place it offline with
> extra jump. Making it likely is better unless p is very small.
>
> I think if p is close to 0 or 1 which may be common case the analysis
> above may be useful. If range [low...high] is small, we can use likely
> and keep it as reliable.
> If it is high, we can probably just end up with guessed value close but
> above 50% so the store stays inline.
I'd say we want to keep the store inline in all cases (we likely lost
any explicit profile info during if-conversion), not sure what we gain
with providing a better "guess" here. So I think we should simply
go with 'likely' derived from statistical independent events.
If in future we can tie if-conversion and vectorization even closer
we might be able to preserve profile data here.
Richard.
>
> Honza
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2023-07-24 7:28 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-07-17 10:35 Fix optimize_mask_stores profile update Jan Hubicka
2023-07-17 10:49 ` Richard Biener
2023-07-17 12:38 ` Jan Hubicka
2023-07-17 13:26 ` Richard Biener
2023-07-21 17:34 ` Jan Hubicka
2023-07-24 7:28 ` Richard Biener
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).