public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Richard Sandiford <richard.sandiford@arm.com>
To: Richard Biener <richard.guenther@gmail.com>
Cc: GCC Patches <gcc-patches@gcc.gnu.org>,  Jeff Law <law@redhat.com>
Subject: Re: [PATCH] alias: Fix offset checks involving section anchors [PR92294]
Date: Wed, 19 Feb 2020 12:19:00 -0000	[thread overview]
Message-ID: <mptmu9ekbvz.fsf@arm.com> (raw)
In-Reply-To: <mptblqawd9g.fsf@arm.com> (Richard Sandiford's message of "Fri,	07 Feb 2020 16:52:43 +0000")

What should we do about this?  The PR is a wrong-code regression from
GCC 9 and it doesn't look like we can remove the second offset_overlap_p
check, given how many cases currently rely on it.

Thanks,
Richard

Richard Sandiford <richard.sandiford@arm.com> writes:
> Richard Biener <richard.guenther@gmail.com> writes:
>> On Tue, Feb 4, 2020 at 6:44 PM Richard Sandiford
>> <richard.sandiford@arm.com> wrote:
>>>
>>> Richard Sandiford <richard.sandiford@arm.com> writes:
>>> > [...]
>>> >> I'm not sure given the issues you've introduced if I could actually
>>> >> fill out the matrix of answers without more underlying information.
>>> >> ie, when can we get symbols without source level decls,
>>> >> anchors+interposition issues, etc.
>>> >
>>> > OK.  In that case, I wonder whether it would be safer to have a
>>> > fourth state on top of the three above:
>>> >
>>> >   - known distance apart
>>> >   - independent
>>> >   - known distance apart or independent
>>> >   - don't know
>>> >
>>> > with "don't know" being anything that involves bare symbols?
>>> >
>>> > Richard
>>>
>>> How does this look?  Tested on aarch64-linux-gnu and
>>> x86_64-linux-gnu.
>>>
>>> Full description from scratch:
>>>
>>> memrefs_conflict_p has a slightly odd structure.  It first checks
>>> whether two addresses based on SYMBOL_REFs refer to the same object,
>>> with a tristate result:
>>>
>>>       int cmp = compare_base_symbol_refs (x,y);
>>>
>>> If the addresses do refer to the same object, we can use offset-based checks:
>>>
>>>       /* If both decls are the same, decide by offsets.  */
>>>       if (cmp == 1)
>>>         return offset_overlap_p (c, xsize, ysize);
>>>
>>> But then, apart from the special case of forced address alignment,
>>> we use an offset-based check even if we don't know whether the
>>> addresses refer to the same object:
>>>
>>>       /* Assume a potential overlap for symbolic addresses that went
>>>          through alignment adjustments (i.e., that have negative
>>>          sizes), because we can't know how far they are from each
>>>          other.  */
>>>       if (maybe_lt (xsize, 0) || maybe_lt (ysize, 0))
>>>         return -1;
>>>       /* If decls are different or we know by offsets that there is no overlap,
>>>          we win.  */
>>>       if (!cmp || !offset_overlap_p (c, xsize, ysize))
>>>         return 0;
>>>
>>> This somewhat contradicts:
>>>
>>>   /* In general we assume that memory locations pointed to by different labels
>>>      may overlap in undefined ways.  */
>>
>> Sorry for not chiming in earlier but isn't the bug that
>>
>>
>>
>>> at the end of compare_base_symbol_refs.  In other words, we're taking -1
>>> to mean that either (a) the symbols are equal (via aliasing) or (b) the
>>> references access non-overlapping objects.
>>>
>>> But even assuming that's true for normal symbols, it doesn't cope
>>> correctly with section anchors.  If a symbol X at ANCHOR+OFFSET
>>> is preemptible, either (a) X = ANDHOR+OFFSET or (b) X and ANCHOR
>>> reference non-overlapping objects.
>>>
>>> And an offset-based comparison makes no sense for an anchor symbol
>>> vs. a bare symbol with no decl.  If the bare symbol is allowed to
>>> alias other symbols then it can surely alias any symbol in the
>>> anchor's block, so there are multiple anchor offsets that might
>>> induce an alias.
>>
>> But then isn't the simple fix to honor the -1 and do
>>
>> diff --git a/gcc/alias.c b/gcc/alias.c
>> index 3794f9b6a9e..bf13d37c0f7 100644
>> --- a/gcc/alias.c
>> +++ b/gcc/alias.c
>> @@ -2490,9 +2490,8 @@ memrefs_conflict_p (poly_int64 xsize, rtx x,
>> poly_int64 ysize, rtx y,
>>          other.  */
>>        if (maybe_lt (xsize, 0) || maybe_lt (ysize, 0))
>>         return -1;
>> -      /* If decls are different or we know by offsets that there is no overlap,
>> -        we win.  */
>> -      if (!cmp || !offset_overlap_p (c, xsize, ysize))
>> +      /* If decls are different, we win.  */
>> +      if (cmp == 0)
>>         return 0;
>>        /* Decls may or may not be different and offsets overlap....*/
>>        return -1;
>>
>> ?
>
> The code was deliberately written this way from the ouset though
> (rather than it ending up like this through many cuts).  It was
> added in g:54363f8a92920f5559c83ddd53e480a27205e6b7:
>
> 2015-12-08  Jan Hubicka  <hubicka@ucw.cz>
>
> 	PR ipa/61886
> 	PR middle-end/25140
> 	* tree-ssa-alias.c (ptr_deref_may_alias_decl_p): Use compare_base_decls
> 	(nonoverlapping_component_refs_of_decl_p): Update sanity check.
> 	(decl_refs_may_alias_p): Use compare_base_decls.
> 	* alias.c: Include cgraph.h
> 	(get_alias_set): Add cut-off for recursion.
> 	(rtx_equal_for_memref_p): Use rtx_equal_for_memref_p.
> 	(compare_base_decls): New function.
> 	(base_alias_check): Likewise.
> 	(memrefs_conflict_p): Likewise.
> 	(nonoverlapping_memrefs_p): Likewise.
> 	* alias.h (compare_base_decls): Declare.
>
> which included:
>
> -  if (rtx_equal_for_memref_p (x, y))
> +  if (GET_CODE (x) == SYMBOL_REF && GET_CODE (y) == SYMBOL_REF)
> +    {
> +      tree x_decl = SYMBOL_REF_DECL (x);
> +      tree y_decl = SYMBOL_REF_DECL (y);
> +      int cmp;
> +
> +      if (!x_decl || !y_decl)
> +	{
> +	  /* Label and normal symbol are never the same. */
> +	  if (x_decl != y_decl)
> +	    return 0;
> +	  return offset_overlap_p (c, xsize, ysize);
> +	}
> +      else
> +        cmp = compare_base_decls (x_decl, y_decl);
> +
> +      /* If both decls are the same, decide by offsets.  */
> +      if (cmp == 1)
> +        return offset_overlap_p (c, xsize, ysize);
> +      /* If decls are different or we know by offsets that there is no overlap,
> +	 we win.  */
> +      if (!cmp || !offset_overlap_p (c, xsize, ysize))
> +	return 0;
> +      /* Decls may or may not be different and offsets overlap....*/
> +      return -1;
> +    }
> +  else if (rtx_equal_for_memref_p (x, y))
>
> The only significant change since them was to add compare_base_symbol_refs
> (g:73e48cb322152bf504ced8694fa748544ecaa6eb):
>
>    if (GET_CODE (x) == SYMBOL_REF && GET_CODE (y) == SYMBOL_REF)
>      {
> -      tree x_decl = SYMBOL_REF_DECL (x);
> -      tree y_decl = SYMBOL_REF_DECL (y);
> -      int cmp;
> -
> -      if (!x_decl || !y_decl)
> -	{
> -	  /* Label and normal symbol are never the same. */
> -	  if (x_decl != y_decl)
> -	    return 0;
> -	  return offset_overlap_p (c, xsize, ysize);
> -	}
> -      else
> -        cmp = compare_base_decls (x_decl, y_decl);
> +      int cmp = compare_base_symbol_refs (x,y);
>
>>> This patch therefore replaces the current tristate:
>>>
>>>   - known equal
>>>   - known independent (two accesses can't alias)
>>>   - equal or independent
>>>
>>> with:
>>>
>>>   - known distance apart
>>>   - known independent (two accesses can't alias)
>>>   - known distance apart or independent
>>>   - don't know
>>>
>>> For safety, the patch puts all bare symbols in the "don't know"
>>> category.  If that turns out to be too conservative, we at least
>>> need that behaviour for combinations involving a bare symbol
>>> and a section anchor.
>>
>> This sounds complicated (for this stage).  Do you have any statistics as
>> to how it affects actual alias queries (thus outcome in {true,...}_dependence)
>> when you do the "simple" fix?
>
> For a stage3 build of gcc on aarch64-linux-gnu I get:
>
>    ("positive" == conflict, "negative" == no conflict)
>
>    16191  cmp ==  1,    disjoint offsets :  true negative
>    19698  cmp ==  1, overlapping offsets :  true positive
>    79363  cmp == -1,    disjoint offsets :  true negative
>     6965  cmp == -1,    disjoint offsets : false negative <== bug
>       48  cmp == -1, overlapping offsets :  true positive
>      928  cmp == -1, overlapping offsets : false positive <== missed opt
>   123193  total queries
>
> where the cmp == -1 cases are divided into true and false results
> according to whether we get the same answer when taking the (hopefully)
> correct offset into account.  cmp == 0 and what would be cmp == -2 never
> occured.
>
> So it looks like we're relying on the cmp == -1 offset_overlap_p
> check to get true "no conflict" results for ~64% of all queries
> (or ~83% of all true "no conflict" results).  I expect this is
> very dependent on the fact that the port uses section anchors.
>
> The number of buggy cases seems surprisingly high, but I've tried
> to re-check it a couple of times and it looks to be accurate.
> They come from cases where we compare things like:
>
> x: (symbol_ref:DI ("*.LANCHOR0") [flags 0x182])
> y: (symbol_ref:DI ("_ZL23yy_last_accepting_state") [flags 0x82] <var_decl ... yy_last_accepting_state>)
> c: -48
>
> where yy_last_accepting_state is at offset 48 from the anchor.
> I haven't checked where the buggy queries are coming from though;
> might be debug-related and so hard to spot.
>
> Thanks,
> Richard

  reply	other threads:[~2020-02-19 12:19 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <mpth80jj31f.fsf@arm.com>
     [not found] ` <3f05acb37a6edc7d3aa211e55d855ed6d93f03c4.camel@redhat.com>
     [not found]   ` <mpt36br5yvt.fsf@arm.com>
2020-02-04 17:44     ` Richard Sandiford
2020-02-05  8:10       ` Richard Biener
2020-02-07 16:52         ` Richard Sandiford
2020-02-19 12:19           ` Richard Sandiford [this message]
2020-02-19 14:20             ` Richard Biener
2020-02-19 14:22               ` Richard Biener
2020-02-19 15:12               ` Richard Sandiford
2020-02-20  8:49                 ` Richard Biener
2021-01-12 11:24 Richard Sandiford
2021-01-17 23:01 ` Jan Hubicka
2021-01-18  8:26   ` Richard Biener
2021-01-18 12:00     ` Richard Sandiford
2021-01-18 13:29       ` Jan Hubicka
2021-01-18 14:17         ` Richard Sandiford
2021-01-19 16:03           ` Richard Biener
2021-01-19 16:15             ` Jan Hubicka
2021-01-19 17:51               ` Richard Sandiford

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=mptmu9ekbvz.fsf@arm.com \
    --to=richard.sandiford@arm.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=law@redhat.com \
    --cc=richard.guenther@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).