From: Richard Biener <rguenther@suse.de>
To: Jakub Jelinek <jakub@redhat.com>
Cc: Jason Merrill <jason@redhat.com>, Jan Hubicka <jh@suse.cz>,
gcc-patches@gcc.gnu.org
Subject: Re: [PATCH] middle-end, v3: IFN_ASSUME support [PR106654]
Date: Mon, 17 Oct 2022 06:55:40 +0000 (UTC) [thread overview]
Message-ID: <nycvar.YFH.7.77.849.2210170639510.18337@jbgna.fhfr.qr> (raw)
In-Reply-To: <Y0mraujtvgSDjNu4@tucnak>
On Fri, 14 Oct 2022, Jakub Jelinek wrote:
> On Fri, Oct 14, 2022 at 11:27:07AM +0000, Richard Biener wrote:
> > > --- gcc/function.h.jj 2022-10-10 11:57:40.163722972 +0200
> > > +++ gcc/function.h 2022-10-12 19:48:28.887554771 +0200
> > > @@ -438,6 +438,10 @@ struct GTY(()) function {
> > >
> > > /* Set if there are any OMP_TARGET regions in the function. */
> > > unsigned int has_omp_target : 1;
> > > +
> > > + /* Set for artificial function created for [[assume (cond)]].
> > > + These should be GIMPLE optimized, but not expanded to RTL. */
> > > + unsigned int assume_function : 1;
> >
> > I wonder if we should have this along force_output in the symtab
> > node and let the symtab code decide whether to expand?
>
> I actually first had a flag on the symtab node but as the patch shows,
> when it needs to be tested, more frequently I have access to struct function
> than to cgraph node.
I see.
> > > --- gcc/gimplify.cc.jj 2022-10-10 11:57:40.165722944 +0200
> > > +++ gcc/gimplify.cc 2022-10-12 19:48:28.890554730 +0200
> > > @@ -3569,7 +3569,52 @@ gimplify_call_expr (tree *expr_p, gimple
> > > fndecl, 0));
> > > return GS_OK;
> > > }
> > > - /* FIXME: Otherwise expand it specially. */
> > > + /* If not optimizing, ignore the assumptions. */
> > > + if (!optimize)
> > > + {
> > > + *expr_p = NULL_TREE;
> > > + return GS_ALL_DONE;
> > > + }
> > > + /* Temporarily, until gimple lowering, transform
> > > + .ASSUME (cond);
> > > + into:
> > > + guard = .ASSUME ();
> > > + if (guard) goto label_true; else label_false;
> > > + label_true:;
> > > + {
> > > + guard = cond;
> > > + }
> > > + label_false:;
> > > + .ASSUME (guard);
> > > + such that gimple lowering can outline the condition into
> > > + a separate function easily. */
> >
> > So the idea to use lambdas and/or nested functions (for OMP)
> > didn't work out or is more complicated?
>
> Yes, that didn't work out. Both lambda creation and nested function
> handling produce big structures with everything while for the assumptions
> it is better to have separate scalars if possible, lambda creation has
> various language imposed restrictions, diagnostics etc. and isn't
> available in C and I think the outlining in the patch is pretty simple and
> short.
>
> > I wonder if, instead of using the above intermediate form we
> > can have a new structued GIMPLE code with sub-statements
> >
> > .ASSUME
> > {
> > condition;
> > }
>
> That is what I wrote in the patch description as alternative:
> "with the condition wrapped into a GIMPLE_BIND (I admit the above isn't
> extra clean but it is just something to hold it from gimplifier until
> gimple low pass; it reassembles if (condition_never_true) { cond; };
> an alternative would be introduce GOMP_ASSUME statement that would have
> the guard var as operand and the GIMPLE_BIND as body, but for the
> few passes (tree-nested and omp lowering) in between that looked like
> an overkill to me)"
> I can certainly implement that easily.
I'd prefer that, it looks possibly less messy.
> > ? There's gimple_statement_omp conveniently available as base and
> > IIRC you had the requirement to implement some OMP assume as well?
>
> For OpenMP assumptions we right now implement just the holds clause
> of assume and implement it the same way as assume/gnu::assume attributes.
>
> > Of ocurse a different stmt class with body would work as well here,
> > maybe we can even use a gbind with a special flag.
> >
> > The outlining code can then be ajusted to outline a single BIND?
>
> It already is adjusting a single bind (of course with everything nested in
> it).
>
> > It probably won't simplify much that way.
>
> > > +static tree
> > > +create_assumption_fn (location_t loc)
> > > +{
> > > + tree name = clone_function_name_numbered (current_function_decl, "_assume");
> > > + /* For now, will be changed later. */
> >
> > ?
>
> I need to create the FUNCTION_DECL early and only later on discover
> the used automatic vars (for which I need the destination function)
> and only once those are discovered I can create the right
> function type for the function.
>
> > > + tree type = TREE_TYPE (current_function_decl);
>
> > > + DECL_FUNCTION_SPECIFIC_OPTIMIZATION (decl)
> > > + = DECL_FUNCTION_SPECIFIC_OPTIMIZATION (current_function_decl);
> > > + DECL_FUNCTION_SPECIFIC_TARGET (decl)
> > > + = DECL_FUNCTION_SPECIFIC_TARGET (current_function_decl);
> > > + DECL_FUNCTION_VERSIONED (decl)
> > > + = DECL_FUNCTION_VERSIONED (current_function_decl);
> >
> > what does it mean to copy DECL_FUNCTION_VERSIONED here?
>
> This was a copy and paste from elsewhere (I think OpenMP code).
> I guess I can nuke it and even better add some testcase coverage
> for various nasty things like assume in multi-versioned functions.
>
> > > + DECL_ARGUMENTS (lad.id.dst_fn) = parms;
> > > + TREE_TYPE (lad.id.dst_fn) = build_function_type (boolean_type_node, parmt);
> >
> > ah, here's the type. Maybe use error_mark_node as transitional type?
>
> Will see if that works.
> >
> > > + cgraph_node::add_new_function (lad.id.dst_fn, false);
> > > +
> > > + for (unsigned i = 0; i < sz; ++i)
> > > + {
> > > + tree v = lad.decls[i];
> > > + if (TREE_CODE (v) == SSA_NAME)
> > > + release_ssa_name (v);
> > > + }
> > > +
> > > + stmt = gsi_stmt (*gsi);
> > > + lab = as_a <glabel *> (stmt);
> > > + gcc_assert (gimple_label_label (lab) == l1
> > > + || gimple_label_label (lab) == l2);
> > > + gsi_remove (gsi, true);
> > > + stmt = gsi_stmt (*gsi);
> > > + gcc_assert (gimple_call_internal_p (stmt, IFN_ASSUME)
> > > + && gimple_call_num_args (stmt) == 1
> > > + && gimple_call_arg (stmt, 0) == guard);
> > > + data->cannot_fallthru = false;
> > > + gcall *call = gimple_build_call_internal_vec (IFN_ASSUME, vargs);
> > > + gimple_set_location (call, loc);
> > > + gsi_replace (gsi, call, true);
> > > +}
> > >
> > > /* Lower statement GSI. DATA is passed through the recursion. We try to
> > > track the fallthruness of statements and get rid of unreachable return
> > > @@ -354,6 +738,13 @@ lower_stmt (gimple_stmt_iterator *gsi, s
> > > tree decl = gimple_call_fndecl (stmt);
> > > unsigned i;
> > >
> > > + if (gimple_call_internal_p (stmt, IFN_ASSUME)
> > > + && gimple_call_num_args (stmt) == 0)
> > > + {
> > > + lower_assumption (gsi, data);
> > > + return;
> > > + }
> > > +
> > > for (i = 0; i < gimple_call_num_args (stmt); i++)
> > > {
> > > tree arg = gimple_call_arg (stmt, i);
> > > --- gcc/tree-ssa-ccp.cc.jj 2022-10-10 11:57:40.203722414 +0200
> > > +++ gcc/tree-ssa-ccp.cc 2022-10-12 19:48:28.891554716 +0200
> > > @@ -4253,6 +4253,12 @@ pass_fold_builtins::execute (function *f
> > > }
> > >
> > > callee = gimple_call_fndecl (stmt);
> > > + if (!callee
> > > + && gimple_call_internal_p (stmt, IFN_ASSUME))
> > > + {
> > > + gsi_remove (&i, true);
> > > + continue;
> > > + }
> > > if (!callee || !fndecl_built_in_p (callee, BUILT_IN_NORMAL))
> > > {
> > > gsi_next (&i);
> > > --- gcc/lto-streamer-out.cc.jj 2022-10-10 11:57:40.202722428 +0200
> > > +++ gcc/lto-streamer-out.cc 2022-10-12 19:48:28.891554716 +0200
> > > @@ -2278,6 +2278,7 @@ output_struct_function_base (struct outp
> > > bp_pack_value (&bp, fn->calls_eh_return, 1);
> > > bp_pack_value (&bp, fn->has_force_vectorize_loops, 1);
> > > bp_pack_value (&bp, fn->has_simduid_loops, 1);
> > > + bp_pack_value (&bp, fn->assume_function, 1);
> > > bp_pack_value (&bp, fn->va_list_fpr_size, 8);
> > > bp_pack_value (&bp, fn->va_list_gpr_size, 8);
> > > bp_pack_value (&bp, fn->last_clique, sizeof (short) * 8);
> > > --- gcc/lto-streamer-in.cc.jj 2022-10-10 11:57:40.201722442 +0200
> > > +++ gcc/lto-streamer-in.cc 2022-10-12 19:48:28.891554716 +0200
> > > @@ -1318,6 +1318,7 @@ input_struct_function_base (struct funct
> > > fn->calls_eh_return = bp_unpack_value (&bp, 1);
> > > fn->has_force_vectorize_loops = bp_unpack_value (&bp, 1);
> > > fn->has_simduid_loops = bp_unpack_value (&bp, 1);
> > > + fn->assume_function = bp_unpack_value (&bp, 1);
> > > fn->va_list_fpr_size = bp_unpack_value (&bp, 8);
> > > fn->va_list_gpr_size = bp_unpack_value (&bp, 8);
> > > fn->last_clique = bp_unpack_value (&bp, sizeof (short) * 8);
> > > --- gcc/cgraphunit.cc.jj 2022-10-10 11:57:40.152723125 +0200
> > > +++ gcc/cgraphunit.cc 2022-10-12 19:48:28.892554703 +0200
> > > @@ -1882,6 +1882,16 @@ cgraph_node::expand (void)
> > > ggc_collect ();
> > > timevar_pop (TV_REST_OF_COMPILATION);
> > >
> > > + if (DECL_STRUCT_FUNCTION (decl)
> > > + && DECL_STRUCT_FUNCTION (decl)->assume_function)
> > > + {
> > > + /* Assume functions aren't expanded into RTL, on the other side
> > > + we don't want to release their body. */
> > > + if (cfun)
> > > + pop_cfun ();
> > > + return;
> > > + }
> > > +
> > > /* Make sure that BE didn't give up on compiling. */
> > > gcc_assert (TREE_ASM_WRITTEN (decl));
> > > if (cfun)
> > > --- gcc/cfgexpand.cc.jj 2022-10-10 11:57:40.152723125 +0200
> > > +++ gcc/cfgexpand.cc 2022-10-12 19:48:28.893554689 +0200
> > > @@ -6597,6 +6597,14 @@ pass_expand::execute (function *fun)
> > > rtx_insn *var_seq, *var_ret_seq;
> > > unsigned i;
> > >
> > > + if (cfun->assume_function)
> > > + {
> > > + /* Assume functions should not be expanded to RTL. */
> >
> > can we avoid getting here in the first place? I think we don't need
> > any of the post-pass_all_optimizations[_g] passes?
>
> I'm afraid not without revamping passes.def, because
> to easily cat the pass queue from certain point onwards, we
> need all the remaining passes to be wrapped with
> PUSH_INSERT_PASSES_WITHIN.
> So, if we e.g. wanted to cut out everything from pass_tm_init
> onwards, we'd need to wrap:
> NEXT_PASS (pass_tm_init);
> PUSH_INSERT_PASSES_WITHIN (pass_tm_init)
> NEXT_PASS (pass_tm_mark);
> NEXT_PASS (pass_tm_memopt);
> NEXT_PASS (pass_tm_edges);
> POP_INSERT_PASSES ()
> NEXT_PASS (pass_simduid_cleanup);
> NEXT_PASS (pass_vtable_verify);
> NEXT_PASS (pass_lower_vaarg);
> NEXT_PASS (pass_lower_vector);
> NEXT_PASS (pass_lower_complex_O0);
> NEXT_PASS (pass_sancov_O0);
> NEXT_PASS (pass_lower_switch_O0);
> NEXT_PASS (pass_asan_O0);
> NEXT_PASS (pass_tsan_O0);
> NEXT_PASS (pass_sanopt);
> NEXT_PASS (pass_cleanup_eh);
> NEXT_PASS (pass_lower_resx);
> NEXT_PASS (pass_nrv);
> NEXT_PASS (pass_gimple_isel);
> NEXT_PASS (pass_harden_conditional_branches);
> NEXT_PASS (pass_harden_compares);
> NEXT_PASS (pass_warn_access, /*early=*/false);
> NEXT_PASS (pass_cleanup_cfg_post_optimizing);
> NEXT_PASS (pass_warn_function_noreturn);
>
> NEXT_PASS (pass_expand);
> in some wrapper pass with a gate (either also including
> pass_rest_of_compilation but that would mean undesirable
> reindentation of everything there, or just
> the above ones and have assume_function punt in the 2
> or 1 gates).
Ah, they are all in all_passes :/ Maybe we can add something
like TODO_discard_function (or a property) that will not discard
the function but stop compiling it? I wonder if all cleanup
is properly done for the function - I suppose we want to keep the
body around for callers indefinitely.
> What I had in the patch was just skip pass_expand
> and pass_rest_of_compilation, perhaps another possibility
> to do the former would be to define a gate on pass_expand.
Or some gate in the pass manager, like in override_gate_status
check fun->properties & PROP_suspended and have some
pass_suspend_assume add that property for all assume function
bodies.
In case you like any of the above give it a shot, otherwise what
you have isn't too bad, I just wondered if there's a nicer way.
> > > --- gcc/tree-vectorizer.cc.jj 2022-10-10 11:57:40.204722400 +0200
> > > +++ gcc/tree-vectorizer.cc 2022-10-12 19:48:28.894554675 +0200
> > > @@ -1213,6 +1213,10 @@ public:
> > > /* opt_pass methods: */
> > > bool gate (function *fun) final override
> > > {
> > > + /* Vectorization makes range analysis of assume functions
> > > + harder, not easier. */
> > > + if (fun->assume_function)
> > > + return false;
> > > return flag_tree_loop_vectorize || fun->has_force_vectorize_loops;
> > > }
> > >
> > > @@ -1490,7 +1494,14 @@ public:
> > >
> > > /* opt_pass methods: */
> > > opt_pass * clone () final override { return new pass_slp_vectorize (m_ctxt); }
> > > - bool gate (function *) final override { return flag_tree_slp_vectorize != 0; }
> > > + bool gate (function *fun) final override
> > > + {
> > > + /* Vectorization makes range analysis of assume functions harder,
> > > + not easier. */
> >
> > Can we split out these kind of considerations from the initial patch?
>
> Sure.
> >
> > > + if (fun->assume_function)
> > > + return false;
> > > + return flag_tree_slp_vectorize != 0;
> > > + }
> > > unsigned int execute (function *) final override;
> > >
> > > }; // class pass_slp_vectorize
> > > --- gcc/ipa-icf.cc.jj 2022-10-10 11:57:40.201722442 +0200
> > > +++ gcc/ipa-icf.cc 2022-10-12 19:48:28.894554675 +0200
> > > @@ -1517,6 +1517,9 @@ sem_function::parse (cgraph_node *node,
> > > if (!func || (!node->has_gimple_body_p () && !node->thunk))
> > > return NULL;
> > >
> > > + if (func->assume_function)
> > > + return NULL;
> > > +
> >
> > Do we want implicit noipa attribute on the assume functions? Or do
> > we need to IPA-CP into them? I suppose the ranger code can use
> > contextual code from the .ASSUME call for things like
> > assume (side-effect(), i == 1);
>
> Most of normal IPA optimizations are disabled for them because
> they aren't normally called, all we do is take their address
> and pass it to .ASSUME. IPA-ICF was an exception and I had to
> disable it because when it triggered it decided to create a thunk
> which failed to assemble.
> But implicit noipa surely is an option; though of course we want
> inlining etc. to happen into those functions.
> And eventually some kind of IPA SRA of their arguments but with
> different behavior from normal IPA-SRA.
I suppose for now adding noipa is easiest, we'd still inline into
the body of course.
> > Reading some of the patch I guessed you wanted to handle nested
> > assumes. So - is
> >
> > [[assume (a == 4 && ([[assume(b == 3)]], b != 2))]]
> >
> > a thing?
>
> This is not valid, assume can be just on an empty statement.
> But with GNU statement expressions it is a thing and I should add it to
> testsuite coverage.
>
> void
> foo (int a, int b)
> {
> [[assume (a == 4 && ({ [[assume (b == 3)]]; b != 2 }))]];
> }
> is valid.
> I think the code should handle it fine (outline the outer and the new
> outlined function enters pass queue with all_lowering_passes
> and so will see pass_lower_cf again and hopefully work. Will see
> how it goes when I tweak the patch.
Thanks,
Richard.
next prev parent reply other threads:[~2022-10-17 6:55 UTC|newest]
Thread overview: 33+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-10-10 8:54 [PATCH] middle-end " Jakub Jelinek
2022-10-10 21:09 ` Jason Merrill
2022-10-10 21:19 ` Jakub Jelinek
2022-10-11 13:36 ` [PATCH] middle-end, v2: " Jakub Jelinek
2022-10-12 15:48 ` Jason Merrill
2022-10-13 6:50 ` [PATCH] middle-end, v3: " Jakub Jelinek
2022-10-14 11:27 ` Richard Biener
2022-10-14 18:33 ` Jakub Jelinek
2022-10-17 6:55 ` Richard Biener [this message]
2022-10-17 15:44 ` [PATCH] middle-end, v4: " Jakub Jelinek
2022-10-18 7:00 ` Richard Biener
2022-10-18 21:31 ` Andrew MacLeod
2022-10-19 16:06 ` Jakub Jelinek
2022-10-19 16:55 ` Andrew MacLeod
2022-10-19 17:39 ` Jakub Jelinek
2022-10-19 17:41 ` Jakub Jelinek
2022-10-19 18:25 ` Andrew MacLeod
2022-10-19 17:14 ` Andrew MacLeod
2022-10-11 18:05 ` [PATCH] middle-end " Andrew MacLeod
2022-10-12 10:15 ` Jakub Jelinek
2022-10-12 14:31 ` Andrew MacLeod
2022-10-12 14:39 ` Jakub Jelinek
2022-10-12 16:12 ` Andrew MacLeod
2022-10-13 8:11 ` Richard Biener
2022-10-13 9:53 ` Jakub Jelinek
2022-10-13 13:16 ` Andrew MacLeod
2022-10-13 9:57 ` Jakub Jelinek
2022-10-17 17:53 ` Andrew MacLeod
2022-10-14 20:43 ` Martin Uecker
2022-10-14 21:20 ` Jakub Jelinek
2022-10-15 8:07 ` Martin Uecker
2022-10-15 8:53 ` Jakub Jelinek
2022-10-17 5:52 ` Martin Uecker
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=nycvar.YFH.7.77.849.2210170639510.18337@jbgna.fhfr.qr \
--to=rguenther@suse.de \
--cc=gcc-patches@gcc.gnu.org \
--cc=jakub@redhat.com \
--cc=jason@redhat.com \
--cc=jh@suse.cz \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).