From: "hubicka at kam dot mff.cuni.cz"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/101908] [12 regression] cray regression with -O2 -ftree-slp-vectorize compared to -O2
Date: Thu, 28 Oct 2021 13:09:30 +0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101908

--- Comment #10 from hubicka at kam dot mff.cuni.cz ---
>        |   b = 2.0 * ray.dir.x * (ray.orig.x - sph->pos.x) +
>        |     movupd (%rdi),%xmm5
>        |       2.0 * ray.dir.y * (ray.orig.y - sph->pos.y) +
>        |       2.0 * ray.dir.z * (ray.orig.z - sph->pos.z);
>   0.02 |     movsd  0x10(%rdi),%xmm9
>   0.01 |     movupd 0xb8(%rsp),%xmm13
>  37.67 |     movupd 0xa0(%rsp),%xmm15
>
> so we pass struct ray on the stack(?) and perform SSE loads from it but
> the argument passing does
>
>   0.88 |     movups %xmm2,(%rsp)
>   0.22 |     movups %xmm3,0x10(%rsp)
>  43.81 |     movups %xmm4,0x20(%rsp)
>   0.66 |     call   ray_sphere

Adding Martin to CC. I think we could teach ipa-sra, with -flto, to either
split the structure into scalar arguments or pass it by reference, which
would allow us to hoist its initialization out of the loop body.

Honza
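
To make the suggested transformation concrete, here is a minimal C sketch of the two calling styles. It is not the c-ray source: the type layouts and the names b_term_byval, b_term_byref, sum_byval and sum_byref are hypothetical, chosen only to mirror the expression quoted above. The sketch assumes a 48-byte struct ray (two vectors of three doubles), which matches the three 16-byte stores in the profile; on x86-64 System V an aggregate of that size is passed in memory, so the by-value caller has to build a copy for each call.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layouts, assumed for illustration only. */
struct vec3   { double x, y, z; };
struct ray    { struct vec3 orig, dir; };   /* 48 bytes */
struct sphere { struct vec3 pos; double rad; };

/* By value: the caller materialises a copy of the ray for every call. */
static double b_term_byval(const struct sphere *sph, struct ray ray)
{
    return 2.0 * ray.dir.x * (ray.orig.x - sph->pos.x) +
           2.0 * ray.dir.y * (ray.orig.y - sph->pos.y) +
           2.0 * ray.dir.z * (ray.orig.z - sph->pos.z);
}

/* By reference: only a pointer is passed, so the ray can be set up once. */
static double b_term_byref(const struct sphere *sph, const struct ray *ray)
{
    return 2.0 * ray->dir.x * (ray->orig.x - sph->pos.x) +
           2.0 * ray->dir.y * (ray->orig.y - sph->pos.y) +
           2.0 * ray->dir.z * (ray->orig.z - sph->pos.z);
}

/* Loop in the shape of the hot path: the argument copy sits inside the loop. */
double sum_byval(const struct sphere *sph, size_t n, struct ray ray)
{
    assert(sph != NULL || n == 0);
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += b_term_byval(&sph[i], ray);
    return sum;
}

/* Same loop after the rewrite: the ray is loop-invariant and never copied. */
double sum_byref(const struct sphere *sph, size_t n, const struct ray *ray)
{
    assert(sph != NULL || n == 0);
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += b_term_byref(&sph[i], ray);
    return sum;
}
```

Both variants compute the same value; the difference is only in how the argument reaches the callee. In a file this small the compiler would likely inline the static helpers, so the sketch shows the source-level shape of the change rather than reproducing the profile.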