Re: [PATCH] middle-end, v3: IFN_ASSUME support [PR106654]

public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed

From: Jakub Jelinek <jakub@redhat.com>
To: Richard Biener <rguenther@suse.de>
Cc: Jason Merrill <jason@redhat.com>, Jan Hubicka <jh@suse.cz>,
	gcc-patches@gcc.gnu.org
Subject: Re: [PATCH] middle-end, v3: IFN_ASSUME support [PR106654]
Date: Fri, 14 Oct 2022 20:33:14 +0200	[thread overview]
Message-ID: <Y0mraujtvgSDjNu4@tucnak> (raw)
In-Reply-To: <nycvar.YFH.7.77.849.2210141103070.18337@jbgna.fhfr.qr>

On Fri, Oct 14, 2022 at 11:27:07AM +0000, Richard Biener wrote:
> > --- gcc/function.h.jj	2022-10-10 11:57:40.163722972 +0200
> > +++ gcc/function.h	2022-10-12 19:48:28.887554771 +0200
> > @@ -438,6 +438,10 @@ struct GTY(()) function {
> >  
> >    /* Set if there are any OMP_TARGET regions in the function.  */
> >    unsigned int has_omp_target : 1;
> > +
> > +  /* Set for artificial function created for [[assume (cond)]].
> > +     These should be GIMPLE optimized, but not expanded to RTL.  */
> > +  unsigned int assume_function : 1;
> 
> I wonder if we should have this along force_output in the symtab
> node and let the symtab code decide whether to expand?

I actually first had a flag on the symtab node but as the patch shows,
when it needs to be tested, more frequently I have access to struct function
than to cgraph node.

> > --- gcc/gimplify.cc.jj	2022-10-10 11:57:40.165722944 +0200
> > +++ gcc/gimplify.cc	2022-10-12 19:48:28.890554730 +0200
> > @@ -3569,7 +3569,52 @@ gimplify_call_expr (tree *expr_p, gimple
> >  						     fndecl, 0));
> >  	      return GS_OK;
> >  	    }
> > -	  /* FIXME: Otherwise expand it specially.  */
> > +	  /* If not optimizing, ignore the assumptions.  */
> > +	  if (!optimize)
> > +	    {
> > +	      *expr_p = NULL_TREE;
> > +	      return GS_ALL_DONE;
> > +	    }
> > +	  /* Temporarily, until gimple lowering, transform
> > +	     .ASSUME (cond);
> > +	     into:
> > +	     guard = .ASSUME ();
> > +	     if (guard) goto label_true; else label_false;
> > +	     label_true:;
> > +	     {
> > +	       guard = cond;
> > +	     }
> > +	     label_false:;
> > +	     .ASSUME (guard);
> > +	     such that gimple lowering can outline the condition into
> > +	     a separate function easily.  */
> 
> So the idea to use lambdas and/or nested functions (for OMP)
> didn't work out or is more complicated?

Yes, that didn't work out.  Both lambda creation and nested function
handling produce big structures with everything while for the assumptions
it is better to have separate scalars if possible, lambda creation has
various language imposed restrictions, diagnostics etc. and isn't
available in C and I think the outlining in the patch is pretty simple and
short.

> I wonder if, instead of using the above intermediate form we
> can have a new structued GIMPLE code with sub-statements
> 
>  .ASSUME
>    {
>      condition;
>    }

That is what I wrote in the patch description as alternative:
"with the condition wrapped into a GIMPLE_BIND (I admit the above isn't                                                                                                                
extra clean but it is just something to hold it from gimplifier until                                                                                                                 
gimple low pass; it reassembles if (condition_never_true) { cond; };                                                                                                                  
an alternative would be introduce GOMP_ASSUME statement that would have                                                                                                               
the guard var as operand and the GIMPLE_BIND as body, but for the                                                                                                                     
few passes (tree-nested and omp lowering) in between that looked like                                                                                                                 
an overkill to me)"
I can certainly implement that easily.

> ?  There's gimple_statement_omp conveniently available as base and
> IIRC you had the requirement to implement some OMP assume as well?

For OpenMP assumptions we right now implement just the holds clause
of assume and implement it the same way as assume/gnu::assume attributes.

> Of ocurse a different stmt class with body would work as well here,
> maybe we can even use a gbind with a special flag.
> 
> The outlining code can then be ajusted to outline a single BIND?

It already is adjusting a single bind (of course with everything nested in
it).

> It probably won't simplify much that way.

> > +static tree
> > +create_assumption_fn (location_t loc)
> > +{
> > +  tree name = clone_function_name_numbered (current_function_decl, "_assume");
> > +  /* For now, will be changed later.  */
> 
> ?

I need to create the FUNCTION_DECL early and only later on discover
the used automatic vars (for which I need the destination function)
and only once those are discovered I can create the right
function type for the function.

> > +  tree type = TREE_TYPE (current_function_decl);

> > +  DECL_FUNCTION_SPECIFIC_OPTIMIZATION (decl)
> > +    = DECL_FUNCTION_SPECIFIC_OPTIMIZATION (current_function_decl);
> > +  DECL_FUNCTION_SPECIFIC_TARGET (decl)
> > +    = DECL_FUNCTION_SPECIFIC_TARGET (current_function_decl);
> > +  DECL_FUNCTION_VERSIONED (decl)
> > +    = DECL_FUNCTION_VERSIONED (current_function_decl);
> 
> what does it mean to copy DECL_FUNCTION_VERSIONED here?

This was a copy and paste from elsewhere (I think OpenMP code).
I guess I can nuke it and even better add some testcase coverage
for various nasty things like assume in multi-versioned functions.
> > +  DECL_ARGUMENTS (lad.id.dst_fn) = parms;
> > +  TREE_TYPE (lad.id.dst_fn) = build_function_type (boolean_type_node, parmt);
> 
> ah, here's the type.  Maybe use error_mark_node as transitional type?

Will see if that works.
> 
> > +  cgraph_node::add_new_function (lad.id.dst_fn, false);
> > +
> > +  for (unsigned i = 0; i < sz; ++i)
> > +    {
> > +      tree v = lad.decls[i];
> > +      if (TREE_CODE (v) == SSA_NAME)
> > +	release_ssa_name (v);
> > +    }
> > +
> > +  stmt = gsi_stmt (*gsi);
> > +  lab = as_a <glabel *> (stmt);
> > +  gcc_assert (gimple_label_label (lab) == l1
> > +	      || gimple_label_label (lab) == l2);
> > +  gsi_remove (gsi, true);
> > +  stmt = gsi_stmt (*gsi);
> > +  gcc_assert (gimple_call_internal_p (stmt, IFN_ASSUME)
> > +	      && gimple_call_num_args (stmt) == 1
> > +	      && gimple_call_arg (stmt, 0) == guard);
> > +  data->cannot_fallthru = false;
> > +  gcall *call = gimple_build_call_internal_vec (IFN_ASSUME, vargs);
> > +  gimple_set_location (call, loc);
> > +  gsi_replace (gsi, call, true);
> > +}
> >  
> >  /* Lower statement GSI.  DATA is passed through the recursion.  We try to
> >     track the fallthruness of statements and get rid of unreachable return
> > @@ -354,6 +738,13 @@ lower_stmt (gimple_stmt_iterator *gsi, s
> >  	tree decl = gimple_call_fndecl (stmt);
> >  	unsigned i;
> >  
> > +	if (gimple_call_internal_p (stmt, IFN_ASSUME)
> > +	    && gimple_call_num_args (stmt) == 0)
> > +	  {
> > +	    lower_assumption (gsi, data);
> > +	    return;
> > +	  }
> > +
> >  	for (i = 0; i < gimple_call_num_args (stmt); i++)
> >  	  {
> >  	    tree arg = gimple_call_arg (stmt, i);
> > --- gcc/tree-ssa-ccp.cc.jj	2022-10-10 11:57:40.203722414 +0200
> > +++ gcc/tree-ssa-ccp.cc	2022-10-12 19:48:28.891554716 +0200
> > @@ -4253,6 +4253,12 @@ pass_fold_builtins::execute (function *f
> >  	    }
> >  
> >  	  callee = gimple_call_fndecl (stmt);
> > +	  if (!callee
> > +	      && gimple_call_internal_p (stmt, IFN_ASSUME))
> > +	    {
> > +	      gsi_remove (&i, true);
> > +	      continue;
> > +	    }
> >  	  if (!callee || !fndecl_built_in_p (callee, BUILT_IN_NORMAL))
> >  	    {
> >  	      gsi_next (&i);
> > --- gcc/lto-streamer-out.cc.jj	2022-10-10 11:57:40.202722428 +0200
> > +++ gcc/lto-streamer-out.cc	2022-10-12 19:48:28.891554716 +0200
> > @@ -2278,6 +2278,7 @@ output_struct_function_base (struct outp
> >    bp_pack_value (&bp, fn->calls_eh_return, 1);
> >    bp_pack_value (&bp, fn->has_force_vectorize_loops, 1);
> >    bp_pack_value (&bp, fn->has_simduid_loops, 1);
> > +  bp_pack_value (&bp, fn->assume_function, 1);
> >    bp_pack_value (&bp, fn->va_list_fpr_size, 8);
> >    bp_pack_value (&bp, fn->va_list_gpr_size, 8);
> >    bp_pack_value (&bp, fn->last_clique, sizeof (short) * 8);
> > --- gcc/lto-streamer-in.cc.jj	2022-10-10 11:57:40.201722442 +0200
> > +++ gcc/lto-streamer-in.cc	2022-10-12 19:48:28.891554716 +0200
> > @@ -1318,6 +1318,7 @@ input_struct_function_base (struct funct
> >    fn->calls_eh_return = bp_unpack_value (&bp, 1);
> >    fn->has_force_vectorize_loops = bp_unpack_value (&bp, 1);
> >    fn->has_simduid_loops = bp_unpack_value (&bp, 1);
> > +  fn->assume_function = bp_unpack_value (&bp, 1);
> >    fn->va_list_fpr_size = bp_unpack_value (&bp, 8);
> >    fn->va_list_gpr_size = bp_unpack_value (&bp, 8);
> >    fn->last_clique = bp_unpack_value (&bp, sizeof (short) * 8);
> > --- gcc/cgraphunit.cc.jj	2022-10-10 11:57:40.152723125 +0200
> > +++ gcc/cgraphunit.cc	2022-10-12 19:48:28.892554703 +0200
> > @@ -1882,6 +1882,16 @@ cgraph_node::expand (void)
> >    ggc_collect ();
> >    timevar_pop (TV_REST_OF_COMPILATION);
> >  
> > +  if (DECL_STRUCT_FUNCTION (decl)
> > +      && DECL_STRUCT_FUNCTION (decl)->assume_function)
> > +    {
> > +      /* Assume functions aren't expanded into RTL, on the other side
> > +	 we don't want to release their body.  */
> > +      if (cfun)
> > +	pop_cfun ();
> > +      return;
> > +    }
> > +
> >    /* Make sure that BE didn't give up on compiling.  */
> >    gcc_assert (TREE_ASM_WRITTEN (decl));
> >    if (cfun)
> > --- gcc/cfgexpand.cc.jj	2022-10-10 11:57:40.152723125 +0200
> > +++ gcc/cfgexpand.cc	2022-10-12 19:48:28.893554689 +0200
> > @@ -6597,6 +6597,14 @@ pass_expand::execute (function *fun)
> >    rtx_insn *var_seq, *var_ret_seq;
> >    unsigned i;
> >  
> > +  if (cfun->assume_function)
> > +    {
> > +      /* Assume functions should not be expanded to RTL.  */
> 
> can we avoid getting here in the first place?  I think we don't need
> any of the post-pass_all_optimizations[_g] passes?

I'm afraid not without revamping passes.def, because
to easily cat the pass queue from certain point onwards, we
need all the remaining passes to be wrapped with
PUSH_INSERT_PASSES_WITHIN.
So, if we e.g. wanted to cut out everything from pass_tm_init
onwards, we'd need to wrap:
  NEXT_PASS (pass_tm_init);
  PUSH_INSERT_PASSES_WITHIN (pass_tm_init)
      NEXT_PASS (pass_tm_mark);
      NEXT_PASS (pass_tm_memopt);
      NEXT_PASS (pass_tm_edges);
  POP_INSERT_PASSES ()
  NEXT_PASS (pass_simduid_cleanup);
  NEXT_PASS (pass_vtable_verify);
  NEXT_PASS (pass_lower_vaarg);
  NEXT_PASS (pass_lower_vector);
  NEXT_PASS (pass_lower_complex_O0);
  NEXT_PASS (pass_sancov_O0);
  NEXT_PASS (pass_lower_switch_O0);
  NEXT_PASS (pass_asan_O0);
  NEXT_PASS (pass_tsan_O0);
  NEXT_PASS (pass_sanopt);
  NEXT_PASS (pass_cleanup_eh);
  NEXT_PASS (pass_lower_resx);
  NEXT_PASS (pass_nrv);
  NEXT_PASS (pass_gimple_isel);
  NEXT_PASS (pass_harden_conditional_branches);
  NEXT_PASS (pass_harden_compares);
  NEXT_PASS (pass_warn_access, /*early=*/false);
  NEXT_PASS (pass_cleanup_cfg_post_optimizing);
  NEXT_PASS (pass_warn_function_noreturn);

  NEXT_PASS (pass_expand);
in some wrapper pass with a gate (either also including
pass_rest_of_compilation but that would mean undesirable
reindentation of everything there, or just
the above ones and have assume_function punt in the 2
or 1 gates).

What I had in the patch was just skip pass_expand
and pass_rest_of_compilation, perhaps another possibility
to do the former would be to define a gate on pass_expand.
> > --- gcc/tree-vectorizer.cc.jj	2022-10-10 11:57:40.204722400 +0200
> > +++ gcc/tree-vectorizer.cc	2022-10-12 19:48:28.894554675 +0200
> > @@ -1213,6 +1213,10 @@ public:
> >    /* opt_pass methods: */
> >    bool gate (function *fun) final override
> >      {
> > +      /* Vectorization makes range analysis of assume functions
> > +	 harder, not easier.  */
> > +      if (fun->assume_function)
> > +	return false;
> >        return flag_tree_loop_vectorize || fun->has_force_vectorize_loops;
> >      }
> >  
> > @@ -1490,7 +1494,14 @@ public:
> >  
> >    /* opt_pass methods: */
> >    opt_pass * clone () final override { return new pass_slp_vectorize (m_ctxt); }
> > -  bool gate (function *) final override { return flag_tree_slp_vectorize != 0; }
> > +  bool gate (function *fun) final override
> > +  {
> > +    /* Vectorization makes range analysis of assume functions harder,
> > +       not easier.  */
> 
> Can we split out these kind of considerations from the initial patch?

Sure.
> 
> > +    if (fun->assume_function)
> > +      return false;
> > +    return flag_tree_slp_vectorize != 0;
> > +  }
> >    unsigned int execute (function *) final override;
> >  
> >  }; // class pass_slp_vectorize
> > --- gcc/ipa-icf.cc.jj	2022-10-10 11:57:40.201722442 +0200
> > +++ gcc/ipa-icf.cc	2022-10-12 19:48:28.894554675 +0200
> > @@ -1517,6 +1517,9 @@ sem_function::parse (cgraph_node *node,
> >    if (!func || (!node->has_gimple_body_p () && !node->thunk))
> >      return NULL;
> >  
> > +  if (func->assume_function)
> > +    return NULL;
> > +
> 
> Do we want implicit noipa attribute on the assume functions?  Or do
> we need to IPA-CP into them?  I suppose the ranger code can use
> contextual code from the .ASSUME call for things like
> assume (side-effect(), i == 1);

Most of normal IPA optimizations are disabled for them because
they aren't normally called, all we do is take their address
and pass it to .ASSUME.  IPA-ICF was an exception and I had to
disable it because when it triggered it decided to create a thunk
which failed to assemble.
But implicit noipa surely is an option; though of course we want
inlining etc. to happen into those functions.
And eventually some kind of IPA SRA of their arguments but with
different behavior from normal IPA-SRA.

> Reading some of the patch I guessed you wanted to handle nested
> assumes.  So - is
> 
> [[assume (a == 4 && ([[assume(b == 3)]], b != 2))]]
> 
> a thing?

This is not valid, assume can be just on an empty statement.
But with GNU statement expressions it is a thing and I should add it to
testsuite coverage.

void
foo (int a, int b)
{
  [[assume (a == 4 && ({ [[assume (b == 3)]]; b != 2 }))]];
}
is valid.
I think the code should handle it fine (outline the outer and the new
outlined function enters pass queue with all_lowering_passes
and so will see pass_lower_cf again and hopefully work.  Will see
how it goes when I tweak the patch.

	Jakub

next prev parent reply	other threads:[~2022-10-14 18:34 UTC|newest]

Thread overview: 33+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-10-10  8:54 [PATCH] middle-end " Jakub Jelinek
2022-10-10 21:09 ` Jason Merrill
2022-10-10 21:19   ` Jakub Jelinek
2022-10-11 13:36     ` [PATCH] middle-end, v2: " Jakub Jelinek
2022-10-12 15:48       ` Jason Merrill
2022-10-13  6:50         ` [PATCH] middle-end, v3: " Jakub Jelinek
2022-10-14 11:27           ` Richard Biener
2022-10-14 18:33             ` Jakub Jelinek [this message]
2022-10-17  6:55               ` Richard Biener
2022-10-17 15:44             ` [PATCH] middle-end, v4: " Jakub Jelinek
2022-10-18  7:00               ` Richard Biener
2022-10-18 21:31               ` Andrew MacLeod
2022-10-19 16:06                 ` Jakub Jelinek
2022-10-19 16:55                   ` Andrew MacLeod
2022-10-19 17:39                     ` Jakub Jelinek
2022-10-19 17:41                       ` Jakub Jelinek
2022-10-19 18:25                         ` Andrew MacLeod
2022-10-19 17:14                   ` Andrew MacLeod
2022-10-11 18:05 ` [PATCH] middle-end " Andrew MacLeod
2022-10-12 10:15   ` Jakub Jelinek
2022-10-12 14:31     ` Andrew MacLeod
2022-10-12 14:39       ` Jakub Jelinek
2022-10-12 16:12         ` Andrew MacLeod
2022-10-13  8:11           ` Richard Biener
2022-10-13  9:53             ` Jakub Jelinek
2022-10-13 13:16               ` Andrew MacLeod
2022-10-13  9:57           ` Jakub Jelinek
2022-10-17 17:53     ` Andrew MacLeod
2022-10-14 20:43 ` Martin Uecker
2022-10-14 21:20   ` Jakub Jelinek
2022-10-15  8:07     ` Martin Uecker
2022-10-15  8:53       ` Jakub Jelinek
2022-10-17  5:52         ` Martin Uecker

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Y0mraujtvgSDjNu4@tucnak \
    --to=jakub@redhat.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=jason@redhat.com \
    --cc=jh@suse.cz \
    --cc=rguenther@suse.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).