From: Richard Biener <rguenther@suse.de>
To: Richard Sandiford <richard.sandiford@arm.com>
Cc: gcc-patches@gcc.gnu.org, Jakub Jelinek <jakub@redhat.com>,
Jan Hubicka <hubicka@ucw.cz>
Subject: Re: [PATCH] tree-optimization/104912 - ensure cost model is checked first
Date: Thu, 31 Mar 2022 15:26:05 +0200 (CEST) [thread overview]
Message-ID: <73465r1r-22p-p279-901n-n3pr603139o9@fhfr.qr> (raw)
In-Reply-To: <mptpmm2wa8h.fsf@arm.com>
On Thu, 31 Mar 2022, Richard Sandiford wrote:
> Richard Biener <rguenther@suse.de> writes:
> > The following makes sure that when we build the versioning condition
> > for vectorization including the cost model check, we check for the
> > cost model and branch over other versioning checks. That is what
> > the cost modeling assumes, since the cost model check is the only
> > one accounted for in the scalar outside cost. Currently we emit
> > all checks as straight-line code combined with bitwise ops which
> > can result in surprising ordering of checks in the final assembly.
>
> Yeah, this had bugged me too, and meant that we made some bad
> decisions in some of the local benchmarks we use. Was just afraid
> to poke at it, since it seemed like a deliberate decision. :-)
I guess it was rather giving up because of the way our loop-versioning
infrastructure works ...
> > Since loop_version accepts only a single versioning condition
> > the splitting is done after the fact.
> >
> > The result is a 1.5% speedup of 416.gamess on x86_64 when compiling
> > with -Ofast and tuning for generic or skylake. That's not enough
> > to recover from the slowdown when vectorizing but it now cuts off
> > the expensive alias versioning test.
> >
> > Bootstrapped and tested on x86_64-unknown-linux-gnu.
> >
> > OK for trunk?
> >
> > For the rest of the regression my plan is to somehow factor in
> > the evolution of the number of iterations in the outer loop
> > (which is {1, +, 1}) to somehow bump the static profitability
> > estimate and together with the "cheap" cost model check never
> > execute the vectorized version (well, it is actually never executed,
> > but only because the alias check fails).
> >
> > Thanks,
> > Richard.
> >
> > 2022-03-21 Richard Biener <rguenther@suse.de>
> >
> > PR tree-optimization/104912
> > * tree-vect-loop-manip.cc (vect_loop_versioning): Split
> > the cost model check to a separate BB to make sure it is
> > checked first and not combined with other version checks.
> > ---
> > gcc/tree-vect-loop-manip.cc | 53 ++++++++++++++++++++++++++++++++++---
> > 1 file changed, 50 insertions(+), 3 deletions(-)
> >
> > diff --git a/gcc/tree-vect-loop-manip.cc b/gcc/tree-vect-loop-manip.cc
> > index a7bbc916bbc..8ef333eb31b 100644
> > --- a/gcc/tree-vect-loop-manip.cc
> > +++ b/gcc/tree-vect-loop-manip.cc
> > @@ -3445,13 +3445,28 @@ vect_loop_versioning (loop_vec_info loop_vinfo,
> > cond_expr = expr;
> > }
> >
> > + tree cost_name = NULL_TREE;
> > + if (cond_expr
> > + && !integer_truep (cond_expr)
> > + && (version_niter
> > + || version_align
> > + || version_alias
> > + || version_simd_if_cond))
> > + cost_name = cond_expr = force_gimple_operand_1 (unshare_expr (cond_expr),
> > + &cond_expr_stmt_list,
> > + is_gimple_val, NULL_TREE);
> > +
> > if (version_niter)
> > vect_create_cond_for_niters_checks (loop_vinfo, &cond_expr);
> >
> > if (cond_expr)
> > - cond_expr = force_gimple_operand_1 (unshare_expr (cond_expr),
> > - &cond_expr_stmt_list,
> > - is_gimple_condexpr, NULL_TREE);
> > + {
> > + gimple_seq tem = NULL;
> > + cond_expr = force_gimple_operand_1 (unshare_expr (cond_expr),
> > + &tem,
> > + is_gimple_condexpr, NULL_TREE);
> > + gimple_seq_add_seq (&cond_expr_stmt_list, tem);
> > + }
> >
> > if (version_align)
> > vect_create_cond_for_align_checks (loop_vinfo, &cond_expr,
> > @@ -3654,6 +3669,38 @@ vect_loop_versioning (loop_vec_info loop_vinfo,
> > update_ssa (TODO_update_ssa);
> > }
> >
> > + /* Split the cost model check off to a separate BB. Costing assumes
> > + this is the only thing we perform when we enter the scalar loop. */
>
> Maybe “…from a failed cost decision” or something? Might sounds out
> of context like it applied more generally.
Fixed.
> > + if (cost_name)
> > + {
> > + gimple *def = SSA_NAME_DEF_STMT (cost_name);
>
> I realise it should only happen very rarely if at all, but is it
> absolutely guaranteed that the cost condition doesn't fold to a
> constant?
OK, better be safe than sorry - I added a SSA_NAME check.
> > + /* All uses of the cost check are 'true' after the check we
> > + are going to insert. */
> > + replace_uses_by (cost_name, boolean_true_node);
> > + /* And we're going to build the new single use of it. */
> > + gcond *cond = gimple_build_cond (NE_EXPR, cost_name, boolean_false_node,
> > + NULL_TREE, NULL_TREE);
> > + edge e = split_block (gimple_bb (def), def);
> > + gimple_stmt_iterator gsi = gsi_for_stmt (def);
> > + gsi_insert_after (&gsi, cond, GSI_NEW_STMT);
> > + edge true_e, false_e;
> > + extract_true_false_edges_from_block (e->dest, &true_e, &false_e);
> > + e->flags &= ~EDGE_FALLTHRU;
> > + e->flags |= EDGE_TRUE_VALUE;
> > + edge e2 = make_edge (e->src, false_e->dest, EDGE_FALSE_VALUE);
> > + e->probability = prob;
> > + e2->probability = prob.invert ();
>
> Is reusing prob the right thing to do? Wouldn't the path to the vector
> loop end up with a probability of PROB^2?
We're statically using profile_probability::likely () for the
vector path. But you are right, and we scale loop frequencies with
prop. So when we replace if () { with-prob A } with
if () if () { with-prob A } then we probably want both true edges
to have a probability of sqrt (prob) and the false edges sqrt (~prob)?
It _looks_ like profile_probability::split (prob) would do the
trick. Here we replace if (X) with if (X && Y) with overall
likely() probability.
So something like (incremental):
@@ -3446,15 +3446,21 @@ vect_loop_versioning (loop_vec_info loop_vinfo,
}
tree cost_name = NULL_TREE;
+ profile_probability prob2 = profile_probability::uninitialized ();
if (cond_expr
&& !integer_truep (cond_expr)
&& (version_niter
|| version_align
|| version_alias
|| version_simd_if_cond))
- cost_name = cond_expr = force_gimple_operand_1 (unshare_expr
(cond_expr),
- &cond_expr_stmt_list,
- is_gimple_val,
NULL_TREE);
+ {
+ cost_name = cond_expr = force_gimple_operand_1 (unshare_expr
(cond_expr),
+
&cond_expr_stmt_list,
+ is_gimple_val,
NULL_TREE);
+ /* Split prob () into two so that the overall probability of
passing
+ both the cost-model and versioning checks is the orig prob. */
+ prob2 = prob.split (prob);
+ }
if (version_niter)
vect_create_cond_for_niters_checks (loop_vinfo, &cond_expr);
@@ -3688,8 +3695,8 @@ vect_loop_versioning (loop_vec_info loop_vinfo,
e->flags &= ~EDGE_FALLTHRU;
e->flags |= EDGE_TRUE_VALUE;
edge e2 = make_edge (e->src, false_e->dest, EDGE_FALSE_VALUE);
- e->probability = prob;
- e2->probability = prob.invert ();
+ e->probability = prob2;
+ e2->probability = prob2.invert ();
set_immediate_dominator (CDI_DOMINATORS, false_e->dest, e->src);
auto_vec<basic_block, 3> adj;
for (basic_block son = first_dom_son (CDI_DOMINATORS, e->dest);
might do the trick? Honza?
> > + set_immediate_dominator (CDI_DOMINATORS, false_e->dest, e->src);
> > + auto_vec<basic_block, 3> adj;
> > + for (basic_block son = first_dom_son (CDI_DOMINATORS, e->dest);
> > + son;
> > + son = next_dom_son (CDI_DOMINATORS, son))
> > + if (EDGE_COUNT (son->preds) > 1)
> > + adj.safe_push (son);
> > + for (auto son : adj)
> > + set_immediate_dominator (CDI_DOMINATORS, son, e->src);
>
> Seems unfortunate that so much bespoke code is needed to do an operation
> like this, but that's not a very constructive comment, sorry. :-)
Yeah :/
> Anyway, LGTM, but I don't think I'd really be able to spot problems.
Let's see what Honza thinks about the probability issue.
Thanks,
Richard.
> Thanks,
> Richard
>
> > + }
> > +
> > if (version_niter)
> > {
> > /* The versioned loop could be infinite, we need to clear existing
>
--
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Ivo Totev; HRB 36809 (AG Nuernberg)
prev parent reply other threads:[~2022-03-31 13:26 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-03-21 15:10 Richard Biener
2022-03-31 12:57 ` Richard Sandiford
2022-03-31 13:26 ` Richard Biener [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=73465r1r-22p-p279-901n-n3pr603139o9@fhfr.qr \
--to=rguenther@suse.de \
--cc=gcc-patches@gcc.gnu.org \
--cc=hubicka@ucw.cz \
--cc=jakub@redhat.com \
--cc=richard.sandiford@arm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).