From: "amonakov at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug rtl-optimization/110237] gcc.dg/torture/pr58955-2.c is miscompiled by RTL scheduling after reload
Date: Mon, 26 Jun 2023 18:19:39 +0000
X-Bugzilla-Product: gcc
X-Bugzilla-Component: rtl-optimization
X-Bugzilla-Version: 14.0
X-Bugzilla-Keywords: wrong-code
X-Bugzilla-Severity: normal
X-Bugzilla-Status: ASSIGNED
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: rguenth at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110237

--- Comment #18 from Alexander Monakov ---
(In reply to rguenther@suse.de from comment #17)
> Yes, we do the same to loads.  I hope that's not a common technique
> though but I have to admit the vectorizer itself assesses whether it's
> safe to access "gaps" by looking at alignment so its code generation
> is prone to this same "mistake".
>
> Now, is "alignment to 16 is ensured externally" good enough here?
> If we consider
>
> static int a[2];
>
> and code doing
>
>  if (is_aligned (a))
>    {
>      __v4si v = (__attribute__((may_alias)) __v4si *) &a;
>    }
>
> then we cannot even use a DECL_ALIGN that's insufficient for decls
> that bind locally.

I agree. I went with the 'extern' example because there it should be more
obvious the construction ought to work.

> Note we have similar arguments with aggregate type sizes (and TBAA)
> where when we infer a dynamic type from one access we check if
> the other access would fit.  Wouldn't the above then extend to that
> as well given we could also do aggregate copies of "padding" and
> ignore the bits if we'd have ensured the larger access wouldn't trap?

I think a read via a may_alias type just tells you that N bytes are accessible
for reading, not necessarily for writing. So I don't see a problem, but maybe I
didn't quite catch what you are saying.

> So supporting the above might be a bit of a stretch (though I think
> we have to fix the vectorizer here).

What would the solution be? Using a may_alias type for such accesses?

> > > If the v4si store is masked we cannot do this anymore, but the IL
> > > we seed the alias oracle with doesn't know the store is partial.
> > > The only way to "fix" it is to take away all of the information from it.
> >
> > But that won't fix the trapping issue? I think we need a distinct RTX for
> > memory accesses where hardware does fault suppression for masked-out elements.
>
> Yes, it doesn't fix that part.  The idea of using BLKmode instead of
> a vector mode for the MEMs would, I guess, together with specifying
> MEM_SIZE as not known.

Unfortunate if that works for the trapping side, but not for the aliasing side.
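
For reference, the alignment-guarded wide read discussed in this comment can be sketched roughly as below. This is only an illustration, not code from the bug report or the testcase: the names v4si_alias, is_aligned16, sum_first_two and storage are hypothetical. The backing object is deliberately a full 16 bytes so that the sketch stays well-defined whichever way the question in this thread (whether a 16-byte read of a smaller object must be supported) is resolved. It relies on the GCC vector_size, may_alias and aligned attributes.

```c
#include <assert.h>
#include <stdint.h>

/* GCC vector extension: four ints (assuming 4-byte int).  may_alias lets
   accesses through this type alias objects of any other type. */
typedef int v4si_alias __attribute__((vector_size(16), may_alias));

static_assert(sizeof(v4si_alias) == 16, "expected a 16-byte vector type");

/* Hypothetical helper: nonzero if p is 16-byte aligned. */
int is_aligned16(const void *p)
{
    return ((uintptr_t)p & 15) == 0;
}

/* Sum the two ints at p.  If p is 16-byte aligned, read a whole 16-byte
   vector and ignore the upper two lanes; an aligned 16-byte load does not
   straddle a page boundary on typical targets.  Otherwise fall back to
   plain scalar loads. */
int sum_first_two(const int *p)
{
    if (is_aligned16(p)) {
        v4si_alias v = *(const v4si_alias *)p;  /* wide, may_alias read */
        return v[0] + v[1];                     /* lanes 2 and 3 unused */
    }
    return p[0] + p[1];
}

/* Assumption for this sketch: the storage really is 16 bytes and 16-byte
   aligned, so the wide read never leaves the object. */
int storage[4] __attribute__((aligned(16))) = {1, 2, 30, 40};
```

Calling sum_first_two(storage) takes the vector path, while sum_first_two(storage + 1) takes the scalar path because the pointer is only 4-byte aligned. Note that the wide access here is a read only, matching the point above that a may_alias read says N bytes are readable, not that they are writable.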