From: Richard Sandiford <richard.sandiford@arm.com>
To: Richard Biener <richard.guenther@gmail.com>
Cc: GCC Patches <gcc-patches@gcc.gnu.org>, Jeff Law <law@redhat.com>
Subject: Re: [PATCH] alias: Fix offset checks involving section anchors [PR92294]
Date: Wed, 19 Feb 2020 12:19:00 -0000 [thread overview]
Message-ID: <mptmu9ekbvz.fsf@arm.com> (raw)
In-Reply-To: <mptblqawd9g.fsf@arm.com> (Richard Sandiford's message of "Fri, 07 Feb 2020 16:52:43 +0000")
What should we do about this? The PR is a wrong-code regression from
GCC 9 and it doesn't look like we can remove the second offset_overlap_p
check, given how many cases currently rely on it.
Thanks,
Richard
Richard Sandiford <richard.sandiford@arm.com> writes:
> Richard Biener <richard.guenther@gmail.com> writes:
>> On Tue, Feb 4, 2020 at 6:44 PM Richard Sandiford
>> <richard.sandiford@arm.com> wrote:
>>>
>>> Richard Sandiford <richard.sandiford@arm.com> writes:
>>> > [...]
>>> >> I'm not sure given the issues you've introduced if I could actually
>>> >> fill out the matrix of answers without more underlying information.
>>> >> ie, when can we get symbols without source level decls,
>>> >> anchors+interposition issues, etc.
>>> >
>>> > OK. In that case, I wonder whether it would be safer to have a
>>> > fourth state on top of the three above:
>>> >
>>> > - known distance apart
>>> > - independent
>>> > - known distance apart or independent
>>> > - don't know
>>> >
>>> > with "don't know" being anything that involves bare symbols?
>>> >
>>> > Richard
>>>
>>> How does this look? Tested on aarch64-linux-gnu and
>>> x86_64-linux-gnu.
>>>
>>> Full description from scratch:
>>>
>>> memrefs_conflict_p has a slightly odd structure. It first checks
>>> whether two addresses based on SYMBOL_REFs refer to the same object,
>>> with a tristate result:
>>>
>>> int cmp = compare_base_symbol_refs (x,y);
>>>
>>> If the addresses do refer to the same object, we can use offset-based checks:
>>>
>>> /* If both decls are the same, decide by offsets. */
>>> if (cmp == 1)
>>> return offset_overlap_p (c, xsize, ysize);
>>>
>>> But then, apart from the special case of forced address alignment,
>>> we use an offset-based check even if we don't know whether the
>>> addresses refer to the same object:
>>>
>>> /* Assume a potential overlap for symbolic addresses that went
>>> through alignment adjustments (i.e., that have negative
>>> sizes), because we can't know how far they are from each
>>> other. */
>>> if (maybe_lt (xsize, 0) || maybe_lt (ysize, 0))
>>> return -1;
>>> /* If decls are different or we know by offsets that there is no overlap,
>>> we win. */
>>> if (!cmp || !offset_overlap_p (c, xsize, ysize))
>>> return 0;
>>>
>>> This somewhat contradicts:
>>>
>>> /* In general we assume that memory locations pointed to by different labels
>>> may overlap in undefined ways. */
>>
>> Sorry for not chiming in earlier but isn't the bug that
>>
>>
>>
>>> at the end of compare_base_symbol_refs. In other words, we're taking -1
>>> to mean that either (a) the symbols are equal (via aliasing) or (b) the
>>> references access non-overlapping objects.
>>>
>>> But even assuming that's true for normal symbols, it doesn't cope
>>> correctly with section anchors. If a symbol X at ANCHOR+OFFSET
>>> is preemptible, either (a) X = ANDHOR+OFFSET or (b) X and ANCHOR
>>> reference non-overlapping objects.
>>>
>>> And an offset-based comparison makes no sense for an anchor symbol
>>> vs. a bare symbol with no decl. If the bare symbol is allowed to
>>> alias other symbols then it can surely alias any symbol in the
>>> anchor's block, so there are multiple anchor offsets that might
>>> induce an alias.
>>
>> But then isn't the simple fix to honor the -1 and do
>>
>> diff --git a/gcc/alias.c b/gcc/alias.c
>> index 3794f9b6a9e..bf13d37c0f7 100644
>> --- a/gcc/alias.c
>> +++ b/gcc/alias.c
>> @@ -2490,9 +2490,8 @@ memrefs_conflict_p (poly_int64 xsize, rtx x,
>> poly_int64 ysize, rtx y,
>> other. */
>> if (maybe_lt (xsize, 0) || maybe_lt (ysize, 0))
>> return -1;
>> - /* If decls are different or we know by offsets that there is no overlap,
>> - we win. */
>> - if (!cmp || !offset_overlap_p (c, xsize, ysize))
>> + /* If decls are different, we win. */
>> + if (cmp == 0)
>> return 0;
>> /* Decls may or may not be different and offsets overlap....*/
>> return -1;
>>
>> ?
>
> The code was deliberately written this way from the ouset though
> (rather than it ending up like this through many cuts). It was
> added in g:54363f8a92920f5559c83ddd53e480a27205e6b7:
>
> 2015-12-08 Jan Hubicka <hubicka@ucw.cz>
>
> PR ipa/61886
> PR middle-end/25140
> * tree-ssa-alias.c (ptr_deref_may_alias_decl_p): Use compare_base_decls
> (nonoverlapping_component_refs_of_decl_p): Update sanity check.
> (decl_refs_may_alias_p): Use compare_base_decls.
> * alias.c: Include cgraph.h
> (get_alias_set): Add cut-off for recursion.
> (rtx_equal_for_memref_p): Use rtx_equal_for_memref_p.
> (compare_base_decls): New function.
> (base_alias_check): Likewise.
> (memrefs_conflict_p): Likewise.
> (nonoverlapping_memrefs_p): Likewise.
> * alias.h (compare_base_decls): Declare.
>
> which included:
>
> - if (rtx_equal_for_memref_p (x, y))
> + if (GET_CODE (x) == SYMBOL_REF && GET_CODE (y) == SYMBOL_REF)
> + {
> + tree x_decl = SYMBOL_REF_DECL (x);
> + tree y_decl = SYMBOL_REF_DECL (y);
> + int cmp;
> +
> + if (!x_decl || !y_decl)
> + {
> + /* Label and normal symbol are never the same. */
> + if (x_decl != y_decl)
> + return 0;
> + return offset_overlap_p (c, xsize, ysize);
> + }
> + else
> + cmp = compare_base_decls (x_decl, y_decl);
> +
> + /* If both decls are the same, decide by offsets. */
> + if (cmp == 1)
> + return offset_overlap_p (c, xsize, ysize);
> + /* If decls are different or we know by offsets that there is no overlap,
> + we win. */
> + if (!cmp || !offset_overlap_p (c, xsize, ysize))
> + return 0;
> + /* Decls may or may not be different and offsets overlap....*/
> + return -1;
> + }
> + else if (rtx_equal_for_memref_p (x, y))
>
> The only significant change since them was to add compare_base_symbol_refs
> (g:73e48cb322152bf504ced8694fa748544ecaa6eb):
>
> if (GET_CODE (x) == SYMBOL_REF && GET_CODE (y) == SYMBOL_REF)
> {
> - tree x_decl = SYMBOL_REF_DECL (x);
> - tree y_decl = SYMBOL_REF_DECL (y);
> - int cmp;
> -
> - if (!x_decl || !y_decl)
> - {
> - /* Label and normal symbol are never the same. */
> - if (x_decl != y_decl)
> - return 0;
> - return offset_overlap_p (c, xsize, ysize);
> - }
> - else
> - cmp = compare_base_decls (x_decl, y_decl);
> + int cmp = compare_base_symbol_refs (x,y);
>
>>> This patch therefore replaces the current tristate:
>>>
>>> - known equal
>>> - known independent (two accesses can't alias)
>>> - equal or independent
>>>
>>> with:
>>>
>>> - known distance apart
>>> - known independent (two accesses can't alias)
>>> - known distance apart or independent
>>> - don't know
>>>
>>> For safety, the patch puts all bare symbols in the "don't know"
>>> category. If that turns out to be too conservative, we at least
>>> need that behaviour for combinations involving a bare symbol
>>> and a section anchor.
>>
>> This sounds complicated (for this stage). Do you have any statistics as
>> to how it affects actual alias queries (thus outcome in {true,...}_dependence)
>> when you do the "simple" fix?
>
> For a stage3 build of gcc on aarch64-linux-gnu I get:
>
> ("positive" == conflict, "negative" == no conflict)
>
> 16191 cmp == 1, disjoint offsets : true negative
> 19698 cmp == 1, overlapping offsets : true positive
> 79363 cmp == -1, disjoint offsets : true negative
> 6965 cmp == -1, disjoint offsets : false negative <== bug
> 48 cmp == -1, overlapping offsets : true positive
> 928 cmp == -1, overlapping offsets : false positive <== missed opt
> 123193 total queries
>
> where the cmp == -1 cases are divided into true and false results
> according to whether we get the same answer when taking the (hopefully)
> correct offset into account. cmp == 0 and what would be cmp == -2 never
> occured.
>
> So it looks like we're relying on the cmp == -1 offset_overlap_p
> check to get true "no conflict" results for ~64% of all queries
> (or ~83% of all true "no conflict" results). I expect this is
> very dependent on the fact that the port uses section anchors.
>
> The number of buggy cases seems surprisingly high, but I've tried
> to re-check it a couple of times and it looks to be accurate.
> They come from cases where we compare things like:
>
> x: (symbol_ref:DI ("*.LANCHOR0") [flags 0x182])
> y: (symbol_ref:DI ("_ZL23yy_last_accepting_state") [flags 0x82] <var_decl ... yy_last_accepting_state>)
> c: -48
>
> where yy_last_accepting_state is at offset 48 from the anchor.
> I haven't checked where the buggy queries are coming from though;
> might be debug-related and so hard to spot.
>
> Thanks,
> Richard
next prev parent reply other threads:[~2020-02-19 12:19 UTC|newest]
Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <mpth80jj31f.fsf@arm.com>
[not found] ` <3f05acb37a6edc7d3aa211e55d855ed6d93f03c4.camel@redhat.com>
[not found] ` <mpt36br5yvt.fsf@arm.com>
2020-02-04 17:44 ` Richard Sandiford
2020-02-05 8:10 ` Richard Biener
2020-02-07 16:52 ` Richard Sandiford
2020-02-19 12:19 ` Richard Sandiford [this message]
2020-02-19 14:20 ` Richard Biener
2020-02-19 14:22 ` Richard Biener
2020-02-19 15:12 ` Richard Sandiford
2020-02-20 8:49 ` Richard Biener
2021-01-12 11:24 Richard Sandiford
2021-01-17 23:01 ` Jan Hubicka
2021-01-18 8:26 ` Richard Biener
2021-01-18 12:00 ` Richard Sandiford
2021-01-18 13:29 ` Jan Hubicka
2021-01-18 14:17 ` Richard Sandiford
2021-01-19 16:03 ` Richard Biener
2021-01-19 16:15 ` Jan Hubicka
2021-01-19 17:51 ` Richard Sandiford
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=mptmu9ekbvz.fsf@arm.com \
--to=richard.sandiford@arm.com \
--cc=gcc-patches@gcc.gnu.org \
--cc=law@redhat.com \
--cc=richard.guenther@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).