From: Jeff Law <jeffreyalaw@gmail.com>
To: "H.J. Lu" <hjl.tools@gmail.com>, Jan Hubicka <hubicka@kam.mff.cuni.cz>
Cc: GCC Patches <gcc-patches@gcc.gnu.org>,
Richard Biener <rguenther@suse.de>
Subject: Re: Basic kill analysis for modref
Date: Mon, 15 Nov 2021 12:00:13 -0700 [thread overview]
Message-ID: <4eb689bc-e5b7-0ef9-72b0-1084d8700dcd@gmail.com> (raw)
In-Reply-To: <CAMe9rOo7Z56jGJXWsnY+iis96U8k7ZEHikEctdOiFNkChrY75w@mail.gmail.com>
On 11/15/2021 11:51 AM, H.J. Lu via Gcc-patches wrote:
> On Sun, Nov 14, 2021 at 10:53 AM Jan Hubicka via Gcc-patches
> <gcc-patches@gcc.gnu.org> wrote:
>>>> I think you want get_addr_base_and_unit_offset here. All
>>>> variable indexed addresses are in separate stmts. That also means
>>>> you can eventually work with just byte sizes/offsets?
>>> Will do. The access range in the modref summary is bit based (since we want
>>> to disambiguate bitfields like we do in the rest of the alias oracle), but
>>> indeed this part can be in bytes.
>> Actually after the unification I can just use get_ao_ref, which will call
>> ao_ref_init_from_ptr_and_range and has all the logic I need in there.
>> I also noticed that I ended up duplicating the code matching bases
>> and ranges, which is done already twice in the function - once for store
>> targets and once for MEMSET and friends. The latter copy lacked overflow
>> checks, so I took the first copy and moved it to a helper function. This
>> makes the gimple part of the patch really straightforward: just build an
>> ao_ref if possible and then pass it to this function.
>>
>> I also added statistics.
>>
>> I have bootstrapped/regtested the updated patch on x86_64-linux and
>> committed it so I can break out the patches that depend on it.
>> I have a patch improving the kill tracking on the modref side, and the
>> kill oracle itself can also use fnspec and does not need to special-case
>> mem* functions.
>>
>> For cc1plus LTO link I now get:
>>
>> Alias oracle query stats:
>> refs_may_alias_p: 76106130 disambiguations, 100928932 queries
>> pt_solution_includes: 12539931 disambiguations, 39864841 queries
>> ref_maybe_used_by_call_p: 625857 disambiguations, 77138089 queries
>> call_may_clobber_ref_p: 366420 disambiguations, 369293 queries
>> stmt_kills_ref_p: 107503 kills, 5699589 queries
>> nonoverlapping_component_refs_p: 0 disambiguations, 26176 queries
>> nonoverlapping_refs_since_match_p: 30339 disambiguations, 65400 must overlaps, 96698 queries
>> aliasing_component_refs_p: 57500 disambiguations, 15464678 queries
>> TBAA oracle: 28248334 disambiguations 104710521 queries
>> 15220245 are in alias set 0
>> 8905994 queries asked about the same object
>> 98 queries asked about the same alias set
>> 0 access volatile
>> 50371110 are dependent in the DAG
>> 1964740 are artificially in conflict with void *
>>
>> Modref stats:
>> modref kill: 52 kills, 6655 queries
>> modref use: 25204 disambiguations, 692151 queries
>> modref clobber: 2309709 disambiguations, 21877806 queries
>> 5320532 tbaa queries (0.243193 per modref query)
>> 761785 base compares (0.034820 per modref query)
>>
>> PTA query stats:
>> pt_solution_includes: 12539931 disambiguations, 39864841 queries
>> pt_solutions_intersect: 1713075 disambiguations, 14023484 queries
>>
>> We now also get stats for the kill oracle itself:
>> stmt_kills_ref_p: 107503 kills, 5699589 queries
>> and for the modref part:
>> modref kill: 52 kills, 6655 queries
>> So an improvement over the single kill I got from modref before. Still not
>> really great.
>>
>> Honza
>>
>> gcc/ChangeLog:
>>
>> * ipa-modref-tree.c (modref_access_node::update_for_kills): New
>> member function.
>> (modref_access_node::merge_for_kills): Likewise.
>> (modref_access_node::insert_kill): Likewise.
>> * ipa-modref-tree.h (modref_access_node::update_for_kills,
>> modref_access_node::merge_for_kills, modref_access_node::insert_kill):
>> Declare.
>> (modref_access_node::useful_for_kill): New member function.
>> * ipa-modref.c (modref_summary::useful_p): Release useless kills.
>> (lto_modref_summary): Add kills.
>> (modref_summary::dump): Dump kills.
>> (record_access): Add modref_access_node parameter.
>> (record_access_lto): Likewise.
>> (merge_call_side_effects): Merge kills.
>> (analyze_call): Add ALWAYS_EXECUTED param and pass it around.
>> (struct summary_ptrs): Add always_executed field.
>> (analyze_load): Update.
>> (analyze_store): Update; record kills.
>> (analyze_stmt): Add always_executed; record kills in clobbers.
>> (analyze_function): Track always_executed.
>> (modref_summaries::duplicate): Duplicate kills.
>> (update_signature): Release kills.
>> * ipa-modref.h (struct modref_summary): Add kills.
>> * tree-ssa-alias.c (alias_stats): Add kill stats.
>> (dump_alias_stats): Dump kill stats.
>> (store_kills_ref_p): Break out from ...
>> (stmt_kills_ref_p): Use it; handle modref info based kills.
>>
>> gcc/testsuite/ChangeLog:
>>
>> 2021-11-14 Jan Hubicka <hubicka@ucw.cz>
>>
>> * gcc.dg/tree-ssa/modref-dse-3.c: New test.
>>
>>
>> diff --git a/gcc/ipa-modref-tree.c b/gcc/ipa-modref-tree.c
>> index 6fc2b7298f4..bbe23a5a211 100644
>> --- a/gcc/ipa-modref-tree.c
>> +++ b/gcc/ipa-modref-tree.c
>> @@ -638,6 +638,185 @@ modref_access_node::get_ao_ref (const gcall *stmt, ao_ref *ref) const
>> return true;
>> }
>>
>> +/* Return true if A is a subkill of THIS access. */
>> +bool
>> +modref_access_node::contains_for_kills (const modref_access_node &a) const
>> +{
>> + poly_int64 aoffset_adj = 0;
>> +
>> + gcc_checking_assert (parm_index != MODREF_UNKNOWN_PARM
>> + && a.parm_index != MODREF_UNKNOWN_PARM);
>> + if (parm_index != a.parm_index)
>> + return false;
>> + gcc_checking_assert (parm_offset_known && a.parm_offset_known);
>> + aoffset_adj = (a.parm_offset - parm_offset)
>> + * BITS_PER_UNIT;
>> + gcc_checking_assert (range_info_useful_p () && a.range_info_useful_p ());
>> + return known_subrange_p (a.offset + aoffset_adj,
>> + a.max_size, offset, max_size);
>> +}
>> +
>> +/* Merge two ranges both starting at parm_offset1 and update THIS
>> + with result. */
>> +bool
>> +modref_access_node::update_for_kills (poly_int64 parm_offset1,
>> + poly_int64 offset1,
>> + poly_int64 max_size1,
>> + poly_int64 offset2,
>> + poly_int64 max_size2,
>> + bool record_adjustments)
>> +{
>> + if (known_le (offset1, offset2))
>> + ;
>> + else if (known_le (offset2, offset1))
>> + {
>> + std::swap (offset1, offset2);
>> + std::swap (max_size1, max_size2);
>> + }
>> + else
>> + gcc_unreachable ();
>> +
>> + poly_int64 new_max_size = max_size2 + offset2 - offset1;
>> + if (known_le (new_max_size, max_size1))
>> + new_max_size = max_size1;
>> + if (known_eq (parm_offset, parm_offset1)
>> + && known_eq (offset, offset1)
>> + && known_eq (size, new_max_size)
>> + && known_eq (max_size, new_max_size))
>> + return false;
>> +
>> + if (!record_adjustments
>> + || (++adjustments) < param_modref_max_adjustments)
>> + {
>> + parm_offset = parm_offset1;
>> + offset = offset1;
>> + max_size = new_max_size;
>> + size = new_max_size;
>> + gcc_checking_assert (useful_for_kill_p ());
>> + return true;
>> + }
>> + return false;
>> +}
>> +
>> +/* Merge in access A if it is possible to do without losing
>> + precision. Return true if successful.
>> + Unlike merge, assume that both accesses are always executed
>> + and merge size the same way as max_size. */
>> +bool
>> +modref_access_node::merge_for_kills (const modref_access_node &a,
>> + bool record_adjustments)
>> +{
>> + poly_int64 offset1 = 0;
>> + poly_int64 aoffset1 = 0;
>> + poly_int64 new_parm_offset = 0;
>> +
>> + /* We assume that containment was tested earlier. */
>> + gcc_checking_assert (!contains_for_kills (a) && !a.contains_for_kills (*this)
>> + && useful_for_kill_p () && a.useful_for_kill_p ());
>> +
>> + if (parm_index != a.parm_index
>> + || !combined_offsets (a, &new_parm_offset, &offset1, &aoffset1))
>> + return false;
>> +
>> + if (known_le (offset1, aoffset1))
>> + {
>> + if (!known_size_p (max_size)
>> + || known_ge (offset1 + max_size, aoffset1))
>> + return update_for_kills (new_parm_offset, offset1, max_size,
>> + aoffset1, a.max_size, record_adjustments);
>> + }
>> + else if (known_le (aoffset1, offset1))
>> + {
>> + if (!known_size_p (a.max_size)
>> + || known_ge (aoffset1 + a.max_size, offset1))
>> + return update_for_kills (new_parm_offset, offset1, max_size,
>> + aoffset1, a.max_size, record_adjustments);
>> + }
>> + return false;
>> +}
>> +
>> +/* Insert new kill A into KILLS. If RECORD_ADJUSTMENTS is true limit number
>> + of changes to each entry. Return true if something changed. */
>> +
>> +bool
>> +modref_access_node::insert_kill (vec<modref_access_node> &kills,
>> + modref_access_node &a, bool record_adjustments)
>> +{
>> + size_t index;
>> + modref_access_node *a2;
>> + bool merge = false;
>> +
>> + gcc_checking_assert (a.useful_for_kill_p ());
>> +
>> + /* See if we already have a corresponding entry or can merge with
>> + a neighbouring entry. */
>> + FOR_EACH_VEC_ELT (kills, index, a2)
>> + {
>> + if (a2->contains_for_kills (a))
>> + return false;
>> + if (a.contains_for_kills (*a2))
>> + {
>> + a.adjustments = 0;
>> + *a2 = a;
>> + merge = true;
>> + break;
>> + }
>> + if (a2->merge_for_kills (a, record_adjustments))
>> + {
>> + merge = true;
>> + break;
>> + }
>> + }
>> + /* If entry was not found, insert it. */
>> + if (!merge)
>> + {
>> + if ((int)kills.length () >= param_modref_max_accesses)
>> + {
>> + if (dump_file)
>> + fprintf (dump_file,
>> + "--param param=modref-max-accesses limit reached:");
>> + return false;
>> + }
>> + a.adjustments = 0;
>> + kills.safe_push (a);
>> + return true;
>> + }
>> + /* Extending range in an entry may make it possible to merge it with
>> + other entries. */
>> + size_t i;
>> +
>> + for (i = 0; i < kills.length ();)
>> + if (i != index)
>> + {
>> + bool found = false, restart = false;
>> + modref_access_node *a = &kills[i];
>> + modref_access_node *n = &kills[index];
>> +
>> + if (n->contains_for_kills (*a))
>> + found = true;
>> + if (!found && n->merge_for_kills (*a, false))
>> + found = restart = true;
>> + gcc_checking_assert (found || !a->merge_for_kills (*n, false));
>> + if (found)
>> + {
>> + kills.unordered_remove (i);
>> + if (index == kills.length ())
>> + {
>> + index = i;
>> + i++;
>> + }
>> + if (restart)
>> + i = 0;
>> + }
>> + else
>> + i++;
>> + }
>> + else
>> + i++;
>> + return true;
>> +}
>> +
>> +
>> #if CHECKING_P
>>
>> namespace selftest {
>> diff --git a/gcc/ipa-modref-tree.h b/gcc/ipa-modref-tree.h
>> index 2fcabe480bd..1bf2aa8460e 100644
>> --- a/gcc/ipa-modref-tree.h
>> +++ b/gcc/ipa-modref-tree.h
>> @@ -82,6 +82,13 @@ struct GTY(()) modref_access_node
>> {
>> return parm_index != MODREF_UNKNOWN_PARM;
>> }
>> + /* Return true if access can be used to determine a kill. */
>> + bool useful_for_kill_p () const
>> + {
>> + return parm_offset_known && parm_index != MODREF_UNKNOWN_PARM
>> + && parm_index != MODREF_RETSLOT_PARM && known_size_p (size)
>> + && known_eq (max_size, size);
>> + }
>> /* Dump range to debug OUT. */
>> void dump (FILE *out);
>> /* Return true if both accesses are the same. */
>> @@ -98,10 +105,18 @@ struct GTY(()) modref_access_node
>> static int insert (vec <modref_access_node, va_gc> *&accesses,
>> modref_access_node a, size_t max_accesses,
>> bool record_adjustments);
>> + /* Same as insert but for kills where we are conservative the other way
>> + around: if information is lost, the kill is lost. */
>> + static bool insert_kill (vec<modref_access_node> &kills,
>> + modref_access_node &a, bool record_adjustments);
>> private:
>> bool contains (const modref_access_node &) const;
>> + bool contains_for_kills (const modref_access_node &) const;
>> void update (poly_int64, poly_int64, poly_int64, poly_int64, bool);
>> + bool update_for_kills (poly_int64, poly_int64, poly_int64,
>> + poly_int64, poly_int64, bool);
>> bool merge (const modref_access_node &, bool);
>> + bool merge_for_kills (const modref_access_node &, bool);
>> static bool closer_pair_p (const modref_access_node &,
>> const modref_access_node &,
>> const modref_access_node &,
>> diff --git a/gcc/ipa-modref.c b/gcc/ipa-modref.c
>> index b75ed84135b..df4612bbff9 100644
>> --- a/gcc/ipa-modref.c
>> +++ b/gcc/ipa-modref.c
>> @@ -335,6 +335,8 @@ modref_summary::useful_p (int ecf_flags, bool check_flags)
>> return (!side_effects && (ecf_flags & ECF_LOOPING_CONST_OR_PURE));
>> if (loads && !loads->every_base)
>> return true;
>> + else
>> + kills.release ();
>> if (ecf_flags & ECF_PURE)
>> return (!side_effects && (ecf_flags & ECF_LOOPING_CONST_OR_PURE));
>> return stores && !stores->every_base;
>> @@ -351,6 +353,7 @@ struct GTY(()) modref_summary_lto
>> more verbose and thus more likely to hit the limits. */
>> modref_records_lto *loads;
>> modref_records_lto *stores;
>> + auto_vec<modref_access_node> GTY((skip)) kills;
>> auto_vec<eaf_flags_t> GTY((skip)) arg_flags;
>> eaf_flags_t retslot_flags;
>> eaf_flags_t static_chain_flags;
>> @@ -570,6 +573,15 @@ modref_summary::dump (FILE *out)
>> fprintf (out, " stores:\n");
>> dump_records (stores, out);
>> }
>> + if (kills.length ())
>> + {
>> + fprintf (out, " kills:\n");
>> + for (auto kill : kills)
>> + {
>> + fprintf (out, " ");
>> + kill.dump (out);
>> + }
>> + }
>> if (writes_errno)
>> fprintf (out, " Writes errno\n");
>> if (side_effects)
>> @@ -762,13 +774,12 @@ get_access (ao_ref *ref)
>> /* Record access into the modref_records data structure. */
>>
>> static void
>> -record_access (modref_records *tt, ao_ref *ref)
>> +record_access (modref_records *tt, ao_ref *ref, modref_access_node &a)
>> {
>> alias_set_type base_set = !flag_strict_aliasing ? 0
>> : ao_ref_base_alias_set (ref);
>> alias_set_type ref_set = !flag_strict_aliasing ? 0
>> : (ao_ref_alias_set (ref));
>> - modref_access_node a = get_access (ref);
>> if (dump_file)
>> {
>> fprintf (dump_file, " - Recording base_set=%i ref_set=%i ",
>> @@ -781,7 +792,7 @@ record_access (modref_records *tt, ao_ref *ref)
>> /* IPA version of record_access_tree. */
>>
>> static void
>> -record_access_lto (modref_records_lto *tt, ao_ref *ref)
>> +record_access_lto (modref_records_lto *tt, ao_ref *ref, modref_access_node &a)
>> {
>> /* get_alias_set sometimes use different type to compute the alias set
>> than TREE_TYPE (base). Do same adjustments. */
>> @@ -828,7 +839,6 @@ record_access_lto (modref_records_lto *tt, ao_ref *ref)
>> || variably_modified_type_p (ref_type, NULL_TREE)))
>> ref_type = NULL_TREE;
>> }
>> - modref_access_node a = get_access (ref);
>> if (dump_file)
>> {
>> fprintf (dump_file, " - Recording base type:");
>> @@ -932,13 +942,17 @@ bool
>> merge_call_side_effects (modref_summary *cur_summary,
>> gimple *stmt, modref_summary *callee_summary,
>> bool ignore_stores, cgraph_node *callee_node,
>> - bool record_adjustments)
>> + bool record_adjustments, bool always_executed)
>> {
>> auto_vec <modref_parm_map, 32> parm_map;
>> modref_parm_map chain_map;
>> bool changed = false;
>> int flags = gimple_call_flags (stmt);
>>
>> + if ((flags & (ECF_CONST | ECF_NOVOPS))
>> + && !(flags & ECF_LOOPING_CONST_OR_PURE))
>> + return changed;
>> +
>> if (!cur_summary->side_effects && callee_summary->side_effects)
>> {
>> if (dump_file)
>> @@ -950,6 +964,38 @@ merge_call_side_effects (modref_summary *cur_summary,
>> if (flags & (ECF_CONST | ECF_NOVOPS))
>> return changed;
>>
>> + if (always_executed
>> + && callee_summary->kills.length ()
>> + && (!cfun->can_throw_non_call_exceptions
>> + || !stmt_could_throw_p (cfun, stmt)))
>> + {
>> + /* Watch for self recursive updates. */
>> + auto_vec<modref_access_node, 32> saved_kills;
>> +
>> + saved_kills.reserve_exact (callee_summary->kills.length ());
>> + saved_kills.splice (callee_summary->kills);
>> + for (auto kill : saved_kills)
>> + {
>> + if (kill.parm_index >= (int)parm_map.length ())
>> + continue;
>> + modref_parm_map &m
>> + = kill.parm_index == MODREF_STATIC_CHAIN_PARM
>> + ? chain_map
> ^^^^^^^^^^^^^^^^ chain_map isn't initialized.
>
> This caused:
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103262
Yup. It's causing heartburn in various ways in the tester. I was just
tracking it down with valgrind...
jeff
Thread overview: 9+ messages
2021-11-11 12:58 Jan Hubicka
2021-11-12 10:38 ` Richard Biener
2021-11-12 10:47 ` Jan Hubicka
2021-11-12 11:00 ` Richard Biener
2021-11-14 18:53 ` Jan Hubicka
2021-11-15 18:51 ` H.J. Lu
2021-11-15 19:00 ` Jeff Law [this message]
2021-11-15 20:56 ` Jan Hubicka
2021-11-16 8:19 ` Jan Hubicka