From: Richard Biener <rguenther@suse.de>
To: Jan Hubicka <hubicka@kam.mff.cuni.cz>
Cc: gcc-patches@gcc.gnu.org
Subject: Re: Basic kill analysis for modref
Date: Fri, 12 Nov 2021 11:38:56 +0100 (CET)
Message-ID: <o5r62o97-8qno-3sn1-qp2r-75rnn6qs4930@fhfr.qr>
In-Reply-To: <20211111125851.GC17431@kam.mff.cuni.cz>

On Thu, 11 Nov 2021, Jan Hubicka wrote:

> Hi,
> This patch enables optimization of stores that are killed by calls.
> The modref summary is extended by an array containing access ranges,
> relative to function parameters, that are known to be killed by the
> function.  This array is collected during local analysis and optimized
> (so separate stores are merged together).
> 
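For illustration, a slight variant of the new testcase below showing the
pattern this lets DSE remove; the kill ranges of the two stores in
write () are merged into a single 8-byte range:

  __attribute__ ((noinline))
  void write (int *a)
  {
    *a = 1;     /* kill: parm 0, bytes 0..3 */
    a[1] = 2;   /* kill: parm 0, bytes 4..7; merged with the above */
  }

  int test ()
  {
    int a[2];
    a[0] = 2;   /* dead: write () must overwrite it */
    write (a);
    return a[0];
  }
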
> Kill analysis in ipa-modref.c is quite simplistic.  In particular, no WPA
> propagation is done, and we take a very simple approach to proving that a
> given store is executed on each invocation of the function: I simply
> require it to be in the first basic block and before anything that can
> throw externally.  I have fancier code for that, but with this patch I
> primarily want to discuss the interface to tree-ssa-alias.c.  I wonder if
> there are some helpers I can re-use?
> 
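A sketch of what that heuristic accepts and rejects (illustration only,
not part of the patch; bar () stands for any call that may throw
externally):

  void foo (int *p, int *q)
  {
    *p = 1;     /* first BB, nothing throwing before it: recorded as kill */
    bar ();     /* may throw externally, so from here on ... */
    *q = 2;     /* ... stores are no longer considered always executed */
  }
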
> From GCC link time I get 814 functions with a non-empty kill vector.
> 
> Modref stats:
>   modref kill: 39 kills, 7162 queries
>   modref use: 25169 disambiguations, 697722 queries
>   modref clobber: 2290122 disambiguations, 22750147 queries
>   5240008 tbaa queries (0.230329 per modref query)
>   806190 base compares (0.035437 per modref query)
> 
> (note that more kills happen at early optimization, where we have not
> inlined that much yet).
> 
> For tramp3d (non-lto -O3 build):
> 
> Modref stats:
>   modref kill: 45 kills, 630 queries
>   modref use: 750 disambiguations, 10061 queries
>   modref clobber: 35253 disambiguations, 543262 queries
>   85347 tbaa queries (0.157101 per modref query)
>   18727 base compares (0.034471 per modref query)
> 
> So it is not that high, but it gets better after improving the analysis
> side, with -Os and/or PGO (where we offline cdtors), and with wiring in
> same_addr_size_stores_p, which I want to discuss incrementally.
> 
> But at least there are not that many queries to slow down compile times
> noticeably :)
> 
> Honza
> 
> gcc/ChangeLog:
> 
> 	* ipa-modref-tree.h (struct modref_access_node): New member
> 	functions merge_for_kills and update_for_kills.
> 	* ipa-modref.c (modref_summary::useful_p): Kills are not useful when
> 	we cannot analyze loads.
> 	(struct modref_summary_lto): Add kills.
> 	(modref_summary::dump): Dump kills.
> 	(record_access): Take access node as parameter.
> 	(record_access_lto): Likewise.
> 	(add_kill): New function.
> 	(merge_call_side_effects): Merge kills.
> 	(analyze_call): Pass around always_executed.
> 	(struct summary_ptrs): Add always_executed flag.
> 	(analyze_load): Update.
> 	(analyze_store): Handle kills.
> 	(analyze_stmt): Pass around always_executed flag; handle kills from
> 	clobbers.
> 	(analyze_function): Compute always_executed.
> 	(modref_summaries::duplicate): Copy kills.
> 	(update_signature): Release kills.
> 	* ipa-modref.h (struct modref_summary): Add kills.
> 	* tree-ssa-alias.c (dump_alias_stats): Dump kills.
> 	(stmt_kills_ref_p): Handle modref kills.
> 
> gcc/testsuite/ChangeLog:
> 
> 	* gcc.dg/tree-ssa/modref-dse-2.c: New test.
> 
> diff --git a/gcc/tree-ssa-alias.c b/gcc/tree-ssa-alias.c
> index 17ff6bb582c..6f8caa331a6 100644
> --- a/gcc/tree-ssa-alias.c
> +++ b/gcc/tree-ssa-alias.c
> @@ -120,6 +120,8 @@ static struct {
>    unsigned HOST_WIDE_INT modref_use_no_alias;
>    unsigned HOST_WIDE_INT modref_clobber_may_alias;
>    unsigned HOST_WIDE_INT modref_clobber_no_alias;
> +  unsigned HOST_WIDE_INT modref_kill_no;
> +  unsigned HOST_WIDE_INT modref_kill_yes;
>    unsigned HOST_WIDE_INT modref_tests;
>    unsigned HOST_WIDE_INT modref_baseptr_tests;
>  } alias_stats;
> @@ -169,6 +171,12 @@ dump_alias_stats (FILE *s)
>  	   + alias_stats.aliasing_component_refs_p_may_alias);
>    dump_alias_stats_in_alias_c (s);
>    fprintf (s, "\nModref stats:\n");
> +  fprintf (s, "  modref kill: "
> +	   HOST_WIDE_INT_PRINT_DEC" kills, "
> +	   HOST_WIDE_INT_PRINT_DEC" queries\n",
> +	   alias_stats.modref_kill_yes,
> +	   alias_stats.modref_kill_yes
> +	   + alias_stats.modref_kill_no);
>    fprintf (s, "  modref use: "
>  	   HOST_WIDE_INT_PRINT_DEC" disambiguations, "
>  	   HOST_WIDE_INT_PRINT_DEC" queries\n",
> @@ -3373,6 +3381,107 @@ stmt_kills_ref_p (gimple *stmt, ao_ref *ref)
>    if (is_gimple_call (stmt))
>      {
>        tree callee = gimple_call_fndecl (stmt);
> +      struct cgraph_node *node;
> +      modref_summary *summary;
> +
> +      /* Try to disambiguate using modref summary.  Modref records a vector
> +	 of stores with known offsets relative to function parameters that
> +	 must happen on every execution of the function.  Find if we have a
> +	 matching store and verify that the function cannot use the value.  */
> +      if (callee != NULL_TREE
> +	  && (node = cgraph_node::get (callee)) != NULL
> +	  && node->binds_to_current_def_p ()
> +	  && (summary = get_modref_function_summary (node)) != NULL

I wonder why we bother producing summaries for things that do not
bind locally?  The summary->kills.length () has an upper bound?

> +	  && summary->kills.length ())
> +	{
> +	  tree base = ao_ref_base (ref);
> +	  for (unsigned int i = 0; i < summary->kills.length (); i++)
> +	    {
> +	      modref_access_node &a = summary->kills[i];
> +	      tree op;
> +	      poly_offset_int off1_adj = 0, off2_adj = 0;
> +	      poly_int64 off1, off2;
> +	      tree base_ptr = NULL;
> +	      tree base_decl = NULL;
> +
> +	      if (a.parm_index >= 0)
> +		op = gimple_call_arg (stmt, a.parm_index);
> +	      else if (a.parm_index == MODREF_STATIC_CHAIN_PARM)
> +		op = gimple_call_chain (stmt);
> +	      else
> +		gcc_unreachable ();

I wonder if we can abstract this to a modref_access_node method?
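
Something like the following, as a rough sketch (the method name and
exact shape are made up here, not part of the patch):

  /* Return the operand of call STMT that this access is based on,
     or NULL_TREE if it cannot be determined.  */
  tree
  modref_access_node::get_call_arg (const gcall *stmt) const
  {
    if (parm_index >= 0)
      return gimple_call_arg (stmt, parm_index);
    if (parm_index == MODREF_STATIC_CHAIN_PARM)
      return gimple_call_chain (stmt);
    return NULL_TREE;
  }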

> +
> +	      off2_adj += a.parm_offset * BITS_PER_UNIT;

wasn't there a case where parm_offset is unknown? ...

> +	      if (!(off2_adj + a.offset).to_shwi (&off2))
> +		continue;
> +	      if (TREE_CODE (base) == MEM_REF)
> +		{
> +		  off1_adj = mem_ref_offset (base) << LOG2_BITS_PER_UNIT;
> +		  if (TREE_CODE (TREE_OPERAND (base, 0)) == ADDR_EXPR)
> +		    base_decl = TREE_OPERAND (TREE_OPERAND (base, 0), 0);

'base' will be the decl in this case, apart from when the constant
offset doesn't fit ao_ref.offset, so I think you can spare this
special-case and give up on a non-SSA base_ptr.

> +		  else
> +		    base_ptr = TREE_OPERAND (base, 0);
> +		}
> +	      /* Give up on TMRs for now.  */
> +	      else if (TREE_CODE (base) == TARGET_MEM_REF)
> +		break;
> +	      else
> +		base_decl = base;
> +
> +	      gcc_checking_assert (!base_decl || DECL_P (base_decl));
> +	      gcc_checking_assert (!base_ptr
> +				   || TREE_CODE (base_ptr) == SSA_NAME);
> +
> +	      /* OP is a pointer and we have access range from its
> +		 dereference.  */
> +	      if (TREE_CODE (op) == ADDR_EXPR)
> +		{
> +		  poly_int64 size, offset, max_size;
> +		  bool reverse;
> +		  tree op_base = get_ref_base_and_extent
> +			  (TREE_OPERAND (op, 0), &offset, &size,
> +			   &max_size, &reverse);

I think you want get_addr_base_and_unit_offset here.  All
variable indexed addresses are in separate stmts.  That also means
you can eventually work with just byte sizes/offsets?
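
I.e. roughly (an untested sketch of the suggested variant; OFF is in
bytes, so it is scaled before adjusting the bit offset):

  poly_int64 off;
  tree op_base
    = get_addr_base_and_unit_offset (TREE_OPERAND (op, 0), &off);
  /* NULL_TREE means the offset is not constant, so give up.  */
  if (!op_base)
    continue;
  off2_adj += off * BITS_PER_UNIT;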

> +		  if (!known_size_p (size) || !known_eq (size, max_size))
> +		    continue;
> +		  off2_adj += offset;
> +		  /* &str->foo is not passed as a gimple operand directly;
> +		     we would need to look up the def stmt.  */
> +		  gcc_checking_assert (TREE_CODE (op_base) != MEM_REF);
> +		  if (!base_decl
> +		      || compare_base_decls (op_base, base_decl) != 1)
> +		    continue;
> +		}
> +	      else if (!base_ptr || !operand_equal_p (base_ptr, op))
> +		continue;
> +
> +	      if (!(off1_adj + ref->offset).to_shwi (&off1))
> +		continue;
> +	      if (!(off2_adj + a.offset).to_shwi (&off2))
> +		continue;
> +
> +	      if (known_subrange_p (off1, ref->max_size, off2, a.size)
> +		  && dbg_cnt (ipa_mod_ref))
> +		{
> +		  /* For the store to be killed it must not be used earlier.  */
> +		  if (ref_maybe_used_by_call_p_1 (as_a <gcall *> (stmt), ref,
> +						  true))

Hmm, so modref says p->x is killed when we have

foo (struct X *p)
{
  int tem = p->x;
  p->x = 0;
  return tem;
}

?  Or even

foo (struct X *p)
{
  bar ();
  p->x = 0;
}

?

Otherwise it looks sensible.

Richard.

> +		    break;
> +		  if (dump_file && (dump_flags & TDF_DETAILS))
> +		    {
> +		      fprintf (dump_file,
> +			       "ipa-modref: call stmt ");
> +		      print_gimple_stmt (dump_file, stmt, 0);
> +		      fprintf (dump_file,
> +			       "ipa-modref: call to %s kills ",
> +			       node->dump_name ());
> +		      print_generic_expr (dump_file, ref->base);
> +		    }
> +		  ++alias_stats.modref_kill_yes;
> +		  return true;
> +		}
> +	    }
> +	  ++alias_stats.modref_kill_no;
> +	}
>        if (callee != NULL_TREE
>  	  && gimple_call_builtin_p (stmt, BUILT_IN_NORMAL))
>  	switch (DECL_FUNCTION_CODE (callee))
> diff --git a/gcc/ipa-modref-tree.h b/gcc/ipa-modref-tree.h
> index 3e213b23d79..dc3b910801d 100644
> --- a/gcc/ipa-modref-tree.h
> +++ b/gcc/ipa-modref-tree.h
> @@ -287,6 +287,55 @@ struct GTY(()) modref_access_node
>  	      size, max_size, record_adjustments);
>        return true;
>      }
> +  /* Merge in access A if it is possible to do without losing
> +     precision.  Return true if successful.
> +     Unlike merge, assume that both accesses are always executed
> +     and merge size the same way as max_size.  */
> +  bool merge_for_kills (const modref_access_node &a)
> +    {
> +      poly_int64 offset1 = 0;
> +      poly_int64 aoffset1 = 0;
> +      poly_int64 new_parm_offset = 0;
> +
> +      /* We assume that containment was tested earlier.  */
> +      gcc_checking_assert (!contains (a) && !a.contains (*this));
> +      /* For kills we always need to know the parameter.  */
> +      gcc_checking_assert (parm_index != MODREF_UNKNOWN_PARM
> +			   && a.parm_index != MODREF_UNKNOWN_PARM);
> +      if (parm_index != a.parm_index)
> +	return false;
> +      /* Unknown offsets are useless.  */
> +      gcc_checking_assert (parm_offset_known && a.parm_offset_known);
> +      if (!combined_offsets (a, &new_parm_offset, &offset1, &aoffset1))
> +	return false;
> +      /* Ranges must be known and sizes must match max sizes.  */
> +      gcc_checking_assert (range_info_useful_p ()
> +			  && a.range_info_useful_p ()
> +			  && known_size_p (size) && known_eq (size, max_size)
> +			  && known_size_p (a.size)
> +			  && known_eq (a.size, a.max_size));
> +      if (known_le (offset1, aoffset1))
> +	{
> +	  if (!known_size_p (max_size)
> +	      || known_ge (offset1 + max_size, aoffset1))
> +	    {
> +	      update_for_kills (new_parm_offset, offset1, max_size,
> +				aoffset1, a.max_size);
> +	      return true;
> +	    }
> +	}
> +      else if (known_le (aoffset1, offset1))
> +	{
> +	  if (!known_size_p (a.max_size)
> +	      || known_ge (aoffset1 + a.max_size, offset1))
> +	    {
> +	      update_for_kills (new_parm_offset, offset1, max_size,
> +				aoffset1, a.max_size);
> +	      return true;
> +	    }
> +	}
> +      return false;
> +    }
>    /* Return true if A1 and B1 can be merged with lower information
>       less than A2 and B2.
>       Assume that no containment or lossless merging is possible.  */
> @@ -430,6 +479,33 @@ private:
>        update (parm_offset1, offset1,
>  	      new_size, new_max_size, record_adjustments);
>      }
> +  /* Merge two ranges both starting at parm_offset1 and update THIS
> +     with the result.  */
> +  void update_for_kills (poly_int64 parm_offset1,
> +			 poly_int64 offset1,
> +			 poly_int64 max_size1,
> +			 poly_int64 offset2,
> +			 poly_int64 max_size2)
> +    {
> +      if (known_le (offset1, offset2))
> +	;
> +      else if (known_le (offset2, offset1))
> +	{
> +	  std::swap (offset1, offset2);
> +	  std::swap (max_size1, max_size2);
> +	}
> +      else
> +	gcc_unreachable ();
> +
> +      poly_int64 new_max_size = max_size2 + offset2 - offset1;
> +      if (known_le (new_max_size, max_size1))
> +	new_max_size = max_size1;
> +
> +      parm_offset = parm_offset1;
> +      offset = offset1;
> +      max_size = new_max_size;
> +      size = new_max_size;
> +    }
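
A worked instance of the merge, to make the arithmetic concrete
(assuming two adjacent int stores):

  offset1 = 0,  max_size1 = 32   (a[0])
  offset2 = 32, max_size2 = 32   (a[1])
  new_max_size = 32 + 32 - 0 = 64   -> one 64-bit kill
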
>    /* Given access nodes THIS and A, return true if they
>       can be done with common parm_offsets.  In this case
>       return parm offset in new_parm_offset, new_offset
> diff --git a/gcc/ipa-modref.c b/gcc/ipa-modref.c
> index f8b7b900527..d9859daa18e 100644
> --- a/gcc/ipa-modref.c
> +++ b/gcc/ipa-modref.c
> @@ -334,6 +334,8 @@ modref_summary::useful_p (int ecf_flags, bool check_flags)
>      return false;
>    if (loads && !loads->every_base)
>      return true;
> +  else
> +    kills.release ();
>    if (ecf_flags & ECF_PURE)
>      return false;
>    return stores && !stores->every_base;
> @@ -370,6 +372,7 @@ struct GTY(()) modref_summary_lto
>       more verbose and thus more likely to hit the limits.  */
>    modref_records_lto *loads;
>    modref_records_lto *stores;
> +  auto_vec<modref_access_node> GTY((skip)) kills;
>    auto_vec<eaf_flags_t> GTY((skip)) arg_flags;
>    eaf_flags_t retslot_flags;
>    eaf_flags_t static_chain_flags;
> @@ -617,6 +620,12 @@ modref_summary::dump (FILE *out)
>        fprintf (out, "  stores:\n");
>        dump_records (stores, out);
>      }
> +  if (kills.length ())
> +    {
> +      fprintf (out, "  kills:\n");
> +      for (unsigned int i = 0; i < kills.length (); i++)
> +	dump_access (&kills[i], out);
> +    }
>    if (writes_errno)
>      fprintf (out, "  Writes errno\n");
>    if (side_effects)
> @@ -750,13 +759,12 @@ get_access (ao_ref *ref)
>  /* Record access into the modref_records data structure.  */
>  
>  static void
> -record_access (modref_records *tt, ao_ref *ref)
> +record_access (modref_records *tt, ao_ref *ref, modref_access_node &a)
>  {
>    alias_set_type base_set = !flag_strict_aliasing ? 0
>  			    : ao_ref_base_alias_set (ref);
>    alias_set_type ref_set = !flag_strict_aliasing ? 0
>  			    : (ao_ref_alias_set (ref));
> -  modref_access_node a = get_access (ref);
>    if (dump_file)
>      {
>         fprintf (dump_file, "   - Recording base_set=%i ref_set=%i ",
> @@ -769,7 +777,7 @@ record_access (modref_records *tt, ao_ref *ref)
>  /* IPA version of record_access_tree.  */
>  
>  static void
> -record_access_lto (modref_records_lto *tt, ao_ref *ref)
> +record_access_lto (modref_records_lto *tt, ao_ref *ref, modref_access_node &a)
>  {
>    /* get_alias_set sometimes uses a different type to compute the alias set
>       than TREE_TYPE (base).  Do the same adjustments.  */
> @@ -816,7 +824,6 @@ record_access_lto (modref_records_lto *tt, ao_ref *ref)
>  		       || variably_modified_type_p (ref_type, NULL_TREE)))
>  	ref_type = NULL_TREE;
>      }
> -  modref_access_node a = get_access (ref);
>    if (dump_file)
>      {
>        fprintf (dump_file, "   - Recording base type:");
> @@ -910,6 +917,82 @@ parm_map_for_arg (tree op)
>    return parm_map;
>  }
>  
> +/* Insert new kill A into KILLS.  */
> +
> +static void
> +add_kill (vec<modref_access_node> &kills, modref_access_node &a)
> +{
> +  size_t index;
> +  modref_access_node *a2;
> +  bool merge = false;
> +
> +  /* See if we have a corresponding entry already or can merge with
> +     a neighbouring entry.  */
> +  FOR_EACH_VEC_ELT (kills, index, a2)
> +    {
> +      if (a2->contains (a))
> +	return;
> +      if (a.contains (*a2))
> +	{
> +	  *a2 = a;
> +	  merge = true;
> +	  break;
> +	}
> +      if (a2->merge_for_kills (a))
> +	{
> +	  merge = true;
> +	  break;
> +	}
> +    }
> +  /* If entry was not found, insert it.  */
> +  if (!merge)
> +    {
> +      if ((int)kills.length () >= param_modref_max_accesses)
> +	{
> +	  if (dump_file)
> +	    fprintf (dump_file,
> +		     "--param param=modref-max-accesses limit reached\n");
> +	  return;
> +	}
> +      kills.safe_push (a);
> +      return;
> +    }
> +  /* Extending the range of an entry may make it possible to merge it
> +     with other entries.  */
> +    {
> +      size_t i;
> +
> +      for (i = 0; i < kills.length ();)
> +	if (i != index)
> +	  {
> +	    bool found = false, restart = false;
> +	    modref_access_node *a = &kills[i];
> +	    modref_access_node *n = &kills[index];
> +
> +	    if (n->contains (*a))
> +	      found = true;
> +	    if (!found && n->merge_for_kills (*a))
> +	      found = restart = true;
> +	    gcc_checking_assert (found || !a->merge_for_kills (*n));
> +	    if (found)
> +	      {
> +		kills.unordered_remove (i);
> +		if (index == kills.length ())
> +		  {
> +		    index = i;
> +		    i++;
> +		  }
> +		if (restart)
> +		  i = 0;
> +	      }
> +	    else
> +	      i++;
> +	  }
> +	else
> +	  i++;
> +    }
> +}
> +
>  /* Merge side effects of call STMT to function with CALLEE_SUMMARY
>     into CUR_SUMMARY.  Return true if something changed.
>     If IGNORE_STORES is true, do not merge stores.
> @@ -920,7 +1003,7 @@ bool
>  merge_call_side_effects (modref_summary *cur_summary,
>  			 gimple *stmt, modref_summary *callee_summary,
>  			 bool ignore_stores, cgraph_node *callee_node,
> -			 bool record_adjustments)
> +			 bool record_adjustments, bool always_executed)
>  {
>    auto_vec <modref_parm_map, 32> parm_map;
>    modref_parm_map chain_map;
> @@ -988,6 +1071,29 @@ merge_call_side_effects (modref_summary *cur_summary,
>  	  changed = true;
>  	}
>      }
> +  if (always_executed)
> +    {
> +      size_t i;
> +      modref_access_node *a2;
> +      FOR_EACH_VEC_ELT (callee_summary->kills, i, a2)
> +	{
> +	  modref_access_node n = *a2;
> +	  if (n.parm_index >= (int)parm_map.length ())
> +	    continue;
> +	  modref_parm_map &m
> +		  = n.parm_index == MODREF_STATIC_CHAIN_PARM
> +		    ? chain_map
> +		    : parm_map[n.parm_index];
> +	  if (m.parm_index == MODREF_LOCAL_MEMORY_PARM
> +	      || m.parm_index == MODREF_UNKNOWN_PARM
> +	      || m.parm_index == MODREF_RETSLOT_PARM
> +	      || !m.parm_offset_known)
> +	    continue;
> +	  n.parm_index = m.parm_index;
> +	  n.parm_offset += m.parm_offset;
> +	  add_kill (cur_summary->kills, n);
> +	}
> +    }
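
To illustrate the remapping with made-up numbers: if the callee kills 32
bits at its parm 0 with parm_offset 8, and the call passes the caller's
parameter 1 plus 16 bytes as that argument, the caller's summary gains a
kill at parm 1 with parm_offset 24.
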
>    if (!cur_summary->side_effects
>        && callee_summary->side_effects)
>      {
> @@ -1198,7 +1304,8 @@ process_fnspec (modref_summary *cur_summary,
>  
>  static bool
>  analyze_call (modref_summary *cur_summary, modref_summary_lto *cur_summary_lto,
> -	      gcall *stmt, vec <gimple *> *recursive_calls)
> +	      gcall *stmt, vec <gimple *> *recursive_calls,
> +	      bool always_executed)
>  {
>    /* Check flags on the function call.  In certain cases, analysis can be
>       simplified.  */
> @@ -1284,7 +1391,7 @@ analyze_call (modref_summary *cur_summary, modref_summary_lto *cur_summary_lto,
>      }
>  
>    merge_call_side_effects (cur_summary, stmt, callee_summary, ignore_stores,
> -			   callee_node, false);
> +			   callee_node, false, always_executed);
>  
>    return true;
>  }
> @@ -1295,6 +1402,7 @@ struct summary_ptrs
>  {
>    struct modref_summary *nolto;
>    struct modref_summary_lto *lto;
> +  bool always_executed;
>  };
>  
>  /* Helper for analyze_stmt.  */
> @@ -1329,11 +1437,12 @@ analyze_load (gimple *, tree, tree op, void *data)
>  
>    ao_ref r;
>    ao_ref_init (&r, op);
> +  modref_access_node a = get_access (&r);
>  
>    if (summary)
> -    record_access (summary->loads, &r);
> +    record_access (summary->loads, &r, a);
>    if (summary_lto)
> -    record_access_lto (summary_lto->loads, &r);
> +    record_access_lto (summary_lto->loads, &r, a);
>    return false;
>  }
>  
> @@ -1369,11 +1478,24 @@ analyze_store (gimple *, tree, tree op, void *data)
>  
>    ao_ref r;
>    ao_ref_init (&r, op);
> +  modref_access_node a = get_access (&r);
>  
>    if (summary)
> -    record_access (summary->stores, &r);
> +    record_access (summary->stores, &r, a);
>    if (summary_lto)
> -    record_access_lto (summary_lto->stores, &r);
> +    record_access_lto (summary_lto->stores, &r, a);
> +  if (summary
> +      && ((summary_ptrs *)data)->always_executed
> +      && a.parm_offset_known
> +      && a.parm_index != MODREF_UNKNOWN_PARM
> +      && a.parm_index != MODREF_RETSLOT_PARM
> +      && known_size_p (r.size)
> +      && known_eq (r.max_size, r.size))
> +    {
> +      if (dump_file)
> +	fprintf (dump_file, "   - Recording kill\n");
> +      add_kill (summary->kills, a);
> +    }
>    return false;
>  }
>  
> @@ -1382,16 +1504,36 @@ analyze_store (gimple *, tree, tree op, void *data)
>  
>  static bool
>  analyze_stmt (modref_summary *summary, modref_summary_lto *summary_lto,
> -	      gimple *stmt, bool ipa, vec <gimple *> *recursive_calls)
> +	      gimple *stmt, bool ipa, vec <gimple *> *recursive_calls,
> +	      bool always_executed)
>  {
>    /* In general we can not ignore clobbers because they are barriers for code
>       motion, however after inlining it is safe to do because local optimization
>       passes do not consider clobbers from other functions.
>       Similar logic is in ipa-pure-const.c.  */
>    if ((ipa || cfun->after_inlining) && gimple_clobber_p (stmt))
> -    return true;
> +    {
> +      if (summary
> +	  && always_executed && record_access_p (gimple_assign_lhs (stmt)))
> +	{
> +	  ao_ref r;
> +	  ao_ref_init (&r, gimple_assign_lhs (stmt));
> +	  modref_access_node a = get_access (&r);
> +	  if (a.parm_offset_known
> +	      && a.parm_index != MODREF_UNKNOWN_PARM
> +	      && a.parm_index != MODREF_RETSLOT_PARM
> +	      && known_size_p (r.size)
> +	      && known_eq (r.max_size, r.size))
> +	    {
> +	      if (dump_file)
> +		fprintf (dump_file, "   - Recording kill\n");
> +	      add_kill (summary->kills, a);
> +	    }
> +	}
> +      return true;
> +    }
>  
> -  struct summary_ptrs sums = {summary, summary_lto};
> +  struct summary_ptrs sums = {summary, summary_lto, always_executed};
>  
>    /* Analyze all loads and stores in STMT.  */
>    walk_stmt_load_store_ops (stmt, &sums,
> @@ -1420,7 +1562,8 @@ analyze_stmt (modref_summary *summary, modref_summary_lto *summary_lto,
>     case GIMPLE_CALL:
>       if (!ipa || gimple_call_internal_p (stmt))
>         return analyze_call (summary, summary_lto,
> -			    as_a <gcall *> (stmt), recursive_calls);
> +			    as_a <gcall *> (stmt), recursive_calls,
> +			    always_executed);
>       else
>        {
>  	attr_fnspec fnspec = gimple_call_fnspec (as_a <gcall *>(stmt));
> @@ -2729,11 +2872,15 @@ analyze_function (function *f, bool ipa)
>    FOR_EACH_BB_FN (bb, f)
>      {
>        gimple_stmt_iterator si;
> +      bool always_executed
> +	      = bb == single_succ_edge (ENTRY_BLOCK_PTR_FOR_FN (cfun))->dest;
> +
>        for (si = gsi_start_nondebug_after_labels_bb (bb);
>  	   !gsi_end_p (si); gsi_next_nondebug (&si))
>  	{
>  	  if (!analyze_stmt (summary, summary_lto,
> -			     gsi_stmt (si), ipa, &recursive_calls)
> +			     gsi_stmt (si), ipa, &recursive_calls,
> +			     always_executed)
>  	      || ((!summary || !summary->useful_p (ecf_flags, false))
>  		  && (!summary_lto
>  		      || !summary_lto->useful_p (ecf_flags, false))))
> @@ -2742,6 +2889,9 @@ analyze_function (function *f, bool ipa)
>  	      collapse_stores (summary, summary_lto);
>  	      break;
>  	    }
> +	  if (always_executed
> +	      && stmt_can_throw_external (cfun, gsi_stmt (si)))
> +	    always_executed = false;
>  	}
>      }
>  
> @@ -2761,7 +2911,7 @@ analyze_function (function *f, bool ipa)
>  			   ignore_stores_p (current_function_decl,
>  					    gimple_call_flags
>  						 (recursive_calls[i])),
> -			   fnode, !first);
> +			   fnode, !first, false);
>  	      if (!summary->useful_p (ecf_flags, false))
>  		{
>  		  remove_summary (lto, nolto, ipa);
> @@ -2993,6 +3143,8 @@ modref_summaries::duplicate (cgraph_node *, cgraph_node *dst,
>  			 src_data->loads->max_refs,
>  			 src_data->loads->max_accesses);
>    dst_data->loads->copy_from (src_data->loads);
> +  dst_data->kills.reserve_exact (src_data->kills.length ());
> +  dst_data->kills.splice (src_data->kills);
>    dst_data->writes_errno = src_data->writes_errno;
>    dst_data->side_effects = src_data->side_effects;
>    if (src_data->arg_flags.length ())
> @@ -3640,6 +3792,8 @@ update_signature (struct cgraph_node *node)
>      {
>        r->loads->remap_params (&map);
>        r->stores->remap_params (&map);
> +      /* TODO: Once we do IPA kills analysis, update the table here.  */
> +      r->kills.release ();
>        if (r->arg_flags.length ())
>  	remap_arg_flags (r->arg_flags, info);
>      }
> @@ -3647,6 +3801,8 @@ update_signature (struct cgraph_node *node)
>      {
>        r_lto->loads->remap_params (&map);
>        r_lto->stores->remap_params (&map);
> +      /* TODO: Once we do IPA kills analysis, update the table here.  */
> +      r_lto->kills.release ();
>        if (r_lto->arg_flags.length ())
>  	remap_arg_flags (r_lto->arg_flags, info);
>      }
> diff --git a/gcc/ipa-modref.h b/gcc/ipa-modref.h
> index 49c99f263a7..b77c1aa7400 100644
> --- a/gcc/ipa-modref.h
> +++ b/gcc/ipa-modref.h
> @@ -30,6 +30,7 @@ struct GTY(()) modref_summary
>    /* Load and stores in function (transitively closed to all callees)  */
>    modref_records *loads;
>    modref_records *stores;
> +  auto_vec<modref_access_node> GTY((skip)) kills;
>    auto_vec<eaf_flags_t> GTY((skip)) arg_flags;
>    eaf_flags_t retslot_flags;
>    eaf_flags_t static_chain_flags;
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/modref-dse-2.c b/gcc/testsuite/gcc.dg/tree-ssa/modref-dse-2.c
> new file mode 100644
> index 00000000000..2a23a9e8ccb
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/modref-dse-2.c
> @@ -0,0 +1,22 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-dse1"  } */
> +__attribute__ ((noinline))
> +void write (int *a)
> +{
> +	*a=1;
> +	a[1]=2;
> +}
> +int test ()
> +{
> +	int a;
> +	a=2;
> +	write (&a);
> +	return a;
> +}
> +int test2 (int *a)
> +{
> +	*a=2;
> +	write (a);
> +	return *a;
> +}
> +/* { dg-final { scan-tree-dump-times "Deleted dead store" 2 "dse1"} } */
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Ivo Totev; HRB 36809 (AG Nuernberg)
