public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Richard Sandiford <richard.sandiford@arm.com>
To: Richard Biener <rguenther@suse.de>
Cc: Jeff Law <jeffreyalaw@gmail.com>,
	 Robin Dapp <rdapp.gcc@gmail.com>,
	 gcc-patches@gcc.gnu.org,  "Li\, Pan2" <pan2.li@intel.com>
Subject: Re: [PATCH][RFC] middle-end/110237 - wrong MEM_ATTRs for partial loads/stores
Date: Tue, 28 Nov 2023 11:32:57 +0000	[thread overview]
Message-ID: <mptv89mru3a.fsf@arm.com> (raw)
In-Reply-To: <nycvar.YFH.7.77.849.2311281114500.21409@jbgna.fhfr.qr> (Richard Biener's message of "Tue, 28 Nov 2023 11:21:21 +0000 (UTC)")

Richard Biener <rguenther@suse.de> writes:
> On Tue, 28 Nov 2023, Richard Sandiford wrote:
>
>> Richard Biener <rguenther@suse.de> writes:
>> > On Mon, 27 Nov 2023, Jeff Law wrote:
>> >
>> >> 
>> >> 
>> >> On 11/27/23 05:39, Robin Dapp wrote:
>> >> >> The easiest way to avoid running into the alias analysis problem is
>> >> >> to scrap the MEM_EXPR when we expand the internal functions for
>> >> >> partial loads/stores.  That avoids the disambiguation we run into
>> >> >> which is realizing that we store to an object of less size as
>> >> >> the size of the mode we appear to store.
>> >> >>
>> >> >> After the patch we see just
>> >> >>
>> >> >>    [1  S64 A32]
>> >> >>
>> >> >> so we preserve the alias set, the alignment and the size (the size
>> >> >> is redundant if the MEM insn't BLKmode).  That's still not good
>> >> >> in case the RTL alias oracle would implement the same
>> >> >> disambiguation but it fends off the gimple one.
>> >> >>
>> >> >> This fixes gcc.dg/torture/pr58955-2.c when built with AVX512
>> >> >> and --param=vect-partial-vector-usage=1.
>> >> > 
>> >> > On riscv we're seeing a similar problem across the testsuite
>> >> > and several execution failures as a result.  In the case I
>> >> > looked at we move a scalar load upwards over a partial store
>> >> > that aliases the load.
>> >> > 
>> >> > I independently arrived at the spot mentioned in
>> >> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110237#c4
>> >> > before knowing about the PR.
>> >> > 
>> >> > I can confirm that your RFC patch fixes at least two of the
>> >> > failures,  I haven't checked the others but very likely
>> >> > they are similar.
>> >> FWIW, it should always be safe to ignore the memory attributes.   So if
>> >> there's a reasonable condition here, then we can use it and just ignore the
>> >> attribute.
>> >> 
>> >> Does the attribute on a partial load/store indicate the potential load/store
>> >> size or does it indicate the actual known load/store size. If the former, then
>> >> we probably need to treat it as a may-read/may-write kind of reference.
>> >
>> > There's no way to distinguish a partial vs. non-partial MEM on RTL and
>> > while without the bogus MEM_ATTR the alias oracle pieces that
>> > miscompiled the original case are fended off we still see the load/store
>> > as full given they have a mode with a size - that for example means
>> > that DSE can elide a previous store to a masked part.  Eventually
>> > that's fended off by using an UNSPEC, but whether the RTL IL has
>> > the correct semantics is questionable.
>> >
>> > That said, I did propose scrapping the MEM_EXPR which I think is
>> > the correct thing to do unless we want to put a CALL_EXPR into it
>> > (nothing would use that at the moment) or re-do MEM_EXPR and instead
>> > have an ao_ref (or sth slightly more complete) instead of the current
>> > MEM_ATTRs - but that would be a lot of work.
>> >
>> > This leaves the question wrt. semantics of for example x86 mask_store:
>> >
>> > (insn 23 22 24 5 (set (mem:V4DF (plus:DI (reg/v/f:DI 106 [ x ])
>> >                 (reg:DI 101 [ ivtmp.15 ])) [2 MEM <vector(4) double> 
>> > [(double *)x_11(D) + ivtmp.15_33 * 1]+0 S32 A64])
>> >         (unspec:V4DF [
>> >                 (reg:V4DI 104 [ mask__16.8 ])
>> >                 (reg:V4DF 105 [ vect_cst__42 ])
>> >                 (mem:V4DF (plus:DI (reg/v/f:DI 106 [ x ])
>> >                         (reg:DI 101 [ ivtmp.15 ])) [2 MEM <vector(4) 
>> > double> [(double *)x_11(D) + ivtmp.15_33 * 1]+0 S32 A64])
>> >             ] UNSPEC_MASKMOV)) "t.c":5:12 8523 {avx_maskstorepd256}
>> >      (nil))
>> >
>> > it uses a read-modify-write which makes it safe for DSE.  mask_load
>> > looks like
>> >
>> > (insn 28 27 29 6 (set (reg:V4DF 115 [ vect__7.11 ])
>> >         (unspec:V4DF [
>> >                 (reg:V4DI 114 [ mask__8.8 ])
>> >                 (mem:V4DF (plus:DI (reg/v/f:DI 118 [ val ])
>> >                         (reg:DI 103 [ ivtmp.29 ])) [2 MEM <vector(4) 
>> > double> [(double *)val_13(D) + ivtmp.29_22 * 1]+0 S32 A64])
>> >             ] UNSPEC_MASKMOV)) "t.c":5:17 8515 {avx_maskloadpd256}
>> >      (nil))
>> >
>> > both have (as operand of the UNSPEC) a MEM with V4DFmode (and a
>> > MEM_EXPR with a similarly bougs MEM_EXPR) indicating the loads
>> > are _not_ partial.  That means the disambiguation against a store
>> > to an object that's smaller than V4DF is still possible.
>> > Setting MEM_SIZE to UNKNOWN doesn't help - that just asks to look
>> > at the mode.  As discussed using a BLKmode MEM _might_ be a way
>> > out but I didn't try what will happen then (patterns would need to
>> > be adjusted I guess).
>> >
>> > That said, I'm happy to commit the partial fix, scrapping the
>> > bogus MEM_EXPRs.
>> >
>> > OK for that?
>> 
>> Could we restrict it to cases where the offset+size might overstep the
>> bounds?  I.e. do the equivalent of:
>> 
>>       /* Drop the object if the new right end is not within its bounds.  */
>>       if (adjust_object && maybe_gt (offset + size, attrs.size))
>> 	{
>> 	  attrs.expr = NULL_TREE;
>> 	  attrs.alias = 0;
>> 	}
>> 
>> (from adjust_address_1, although it might be difficult to reuse that
>> code in this context).
>
> Well, the MEM_EXPR would still not correctly represent the performed
> access.  But yes, if the base decl is visible and the offset is
> all constant we might be able to prove the access is not out of bounds.
> But that doesn't seem to be the most likely case?

Agree it's not the most likely case.  Just thought it was still a
distinction worth making, since it's one we already make elsewhere.

It would also make the condition more self-documenting, and make it
less likely that we're using this to cover up other problems.

> Note we have lost restrict info already when producing the
> .MASK_LOAD/STORE and TBAA info is also preserved without the
> MEM_EXPR.  The MEM_EXPR itself adds points-to info only
> (besides the wrong access size).  With ao_ref we also don't
> have a way to carry only points-to info we also have to
> build a fake access (which then also carries restrict info).

Understood.  It's just that one valid use of masked accesses
is to optimise something like:

int a[1024];
...
  for (int i = 0; i < 1024; ++i)
    a[i] = foo ? bar : a[i];
...

so that no load takes place and the store is masked.  In that context
it seems like the access really is accurately described as a RMW access,
and points-to info can be used in the same way as it would be for an
unmasked load and an unmasked store.  It would be good to retain the
same level of disambiguation, however limited that is in practice.

Thanks,
Richard

  reply	other threads:[~2023-11-28 11:32 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20230621075019.7CA813858033@sourceware.org>
2023-11-27 12:39 ` Robin Dapp
2023-11-27 15:45   ` Jeff Law
2023-11-28  7:50     ` Richard Biener
2023-11-28 10:31       ` Richard Sandiford
2023-11-28 11:21         ` Richard Biener
2023-11-28 11:32           ` Richard Sandiford [this message]
2023-11-28 12:17             ` Richard Biener
2023-11-28 15:00       ` Jeff Law
2023-11-29  7:16         ` Richard Biener
     [not found] <20230621074956.1174B3858288@sourceware.org>
2023-06-26  8:29 ` Hongtao Liu
2023-06-26  8:41   ` Richard Biener
     [not found] <20230621074951.F3C3C3858433@sourceware.org>
2023-06-21 15:29 ` Jeff Law
2023-06-22  6:39   ` Richard Biener
2023-06-24 14:32     ` Jeff Law
2023-06-21  7:49 Richard Biener

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=mptv89mru3a.fsf@arm.com \
    --to=richard.sandiford@arm.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=jeffreyalaw@gmail.com \
    --cc=pan2.li@intel.com \
    --cc=rdapp.gcc@gmail.com \
    --cc=rguenther@suse.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).