From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-bugzilla@gcc.gnu.org>
Received: by sourceware.org (Postfix, from userid 48)
	id C7F3C3858D33; Mon, 26 Jun 2023 18:19:39 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org C7F3C3858D33
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org;
	s=default; t=1687803579;
	bh=vmvmciGIVW1Q3Nty/evGzZLCTQHsHAvyifn/Nc60q34=;
	h=From:To:Subject:Date:In-Reply-To:References:From;
	b=eEOvgAwGfIdxvmXnLjPcgRKs86DGisKS1Md+dGUzM28JS0/KXVTashtanumsLRERc
	 oCuHSCHY9rQhtUrJDv7IiV/sRjH1nbQacc/zKYXv19Oor5ZGNisSKIe78hurO9H065
	 ethBPZWj/5A/3gc7U34vNbm6OB5/Kn+Og4vNBdz4=
From: "amonakov at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug rtl-optimization/110237] gcc.dg/torture/pr58955-2.c is
 miscompiled by RTL scheduling after reload
Date: Mon, 26 Jun 2023 18:19:39 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: rtl-optimization
X-Bugzilla-Version: 14.0
X-Bugzilla-Keywords: wrong-code
X-Bugzilla-Severity: normal
X-Bugzilla-Who: amonakov at gcc dot gnu.org
X-Bugzilla-Status: ASSIGNED
X-Bugzilla-Resolution: 
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: rguenth at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags: 
X-Bugzilla-Changed-Fields: 
Message-ID: <bug-110237-4-C0EmBSRxUQ@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-110237-4@http.gcc.gnu.org/bugzilla/>
References: <bug-110237-4@http.gcc.gnu.org/bugzilla/>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
List-Id: <gcc-bugs.sourceware.org>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110237
--- Comment #18 from Alexander Monakov <amonakov at gcc dot gnu.org> ---
(In reply to rguenther@suse.de from comment #17)
> Yes, we do the same to loads.  I hope that's not a common technique
> though but I have to admit the vectorizer itself assesses whether it's
> safe to access "gaps" by looking at alignment so its code generation
> is prone to this same "mistake".
>
> Now, is "alignment to 16 is ensured externally" good enough here?
> If we consider
>
> static int a[2];
>
> and code doing
>
>  if (is_aligned (a))
>    {
>      __v4si v = (__attribute__((may_alias)) __v4si *) &a;
>    }
>
> then we cannot even use a DECL_ALIGN that's insufficient for decls
> that bind locally.

I agree. I went with the 'extern' example because there it should be more
obvious that the construction ought to work.
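
For illustration only, here is a minimal sketch of the pattern under
discussion, assuming GCC's vector extensions and the may_alias type attribute.
The names v4si_alias, is_aligned16 and sum_first_two are hypothetical and do
not come from the testcase. The sketch shows only the shape of the code: a
16-byte load through a may_alias vector type, guarded by a runtime alignment
check. Whether the compiler has to honor such a load when the declared object
is smaller than 16 bytes is exactly the open question in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* GCC vector extension: four 32-bit ints held in one 16-byte value. */
typedef int v4si __attribute__((vector_size(16)));
/* may_alias variant, so the wide load is exempt from type-based aliasing. */
typedef v4si v4si_alias __attribute__((may_alias));

/* Hypothetical helper: nonzero if p is aligned to 16 bytes. */
static int is_aligned16(const void *p)
{
    return ((uintptr_t)p & 15u) == 0;
}

/* Hypothetical helper: returns p[0] + p[1].
   When p is 16-byte aligned, it reads a full 16-byte vector and uses only
   the first two lanes. An aligned 16-byte read cannot cross a page boundary,
   which is the "alignment is ensured externally" argument. Otherwise it
   falls back to two scalar loads. */
static int sum_first_two(const int *p)
{
    assert(p != NULL);
    if (is_aligned16(p)) {
        v4si v = *(const v4si_alias *)p;  /* wide load, extra lanes ignored */
        return v[0] + v[1];
    }
    return p[0] + p[1];
}
```

Calling sum_first_two on a 16-byte-aligned buffer of four ints takes the
vector path and stays within the buffer. Calling it on an aligned
'static int a[2]' would read 8 bytes beyond the object, which is the case
being debated.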


> Note we have similar arguments with aggregate type sizes (and TBAA)
> where when we infer a dynamic type from one access we check if
> the other access would fit.  Wouldn't the above then extend to that
> as well given we could also do aggregate copies of "padding" and
> ignore the bits if we'd have ensured the larger access wouldn't trap?

I think a read via a may_alias type just tells you that N bytes are accessible
for reading, not necessarily for writing. So I don't see a problem, but maybe I
didn't quite catch what you are saying.


> So supporting the above might be a bit of a stretch (though I think
> we have to fix the vectorizer here).

What would the solution be? Using a may_alias type for such accesses?


> > > If the v4si store is masked we cannot do this anymore, but the IL
> > > we seed the alias oracle with doesn't know the store is partial.
> > > The only way to "fix" it is to take away all of the information from it.
> >
> > But that won't fix the trapping issue? I think we need a distinct RTX for
> > memory accesses where hardware does fault suppression for masked-out
> > elements.
>
> Yes, it doesn't fix that part.  The idea of using BLKmode instead of
> a vector mode for the MEMs would, I guess, together with specifying
> MEM_SIZE as not known.

Unfortunate if that works for the trapping side, but not for the aliasing side.