From: "crazylht at gmail dot com"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/105513] [9/10/11/12/13 Regression] Unnecessary SSE spill since r9-5748-g1d4b4f4979171ef0
Date: Fri, 20 May 2022 09:12:03 +0000
X-Bugzilla-Product: gcc
X-Bugzilla-Component: target
X-Bugzilla-Version: 12.1.0
X-Bugzilla-Keywords: missed-optimization, ra
X-Bugzilla-Severity: normal
X-Bugzilla-Status: NEW
X-Bugzilla-Priority: P2
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: 9.5

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105513

--- Comment #8 from Hongtao.liu ---
(In reply to Alexander Monakov from comment #7)
> The second sequence is 3 uops vs 1/2 (issued/executed) uops in first, and on
> Haswell and Skylake it ties up port 5 for two cycles.
>
> Unclear if you're microbenchmarking latency or throughput, but in any case
> on Haswell and Skylake you should see a close to 2x difference.

I'm counting clock ticks, and thought a load might have higher latency.

#include <stdio.h>
#include <x86intrin.h>

#define LOOP 1000000000

typedef long v2di __attribute__((vector_size(16)));
typedef int v4si __attribute__((vector_size(16)));

/* Overwrite the upper element of the vector; noipa keeps the call opaque.  */
v2di
__attribute__ ((noipa))
foo (v2di a)
{
  a[1] = 111113;
  return a;
}

/* Empty consumer so the result of foo is passed on through a call.  */
void
__attribute__ ((noipa))
foo1 (v2di a)
{
}

int main ()
{
  int i;
  unsigned long long start, end;
  unsigned long long diff;
  unsigned int aux;

  start = __rdtscp (&aux);
  v2di b = __extension__ (v2di){111, 222};
  for (i = 0; i < LOOP; i++)
    {
      v2di a = foo (b);
      foo1 (a);
    }
  end = __rdtscp (&aux);
  diff = end - start;
  /* diff is unsigned, so print it with %llu.  */
  printf ("alterna: %llu\n", diff);
  return 0;
}
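
On the latency-versus-throughput question from comment #7, here is a minimal sketch of how the same loop could be split into two variants. It assumes GCC on x86-64, as in the test above. The names `ITERS`, `ticks_independent` and `ticks_chained` are hypothetical and not part of the bug report. The first variant passes the same input to every call, so the iterations are independent of each other. The second feeds each result into the next call, so every iteration has to wait for the previous one. Call, return and loop overhead are included in both numbers, so this is only a rough way to tell the two cases apart.

```c
#include <stdio.h>
#include <x86intrin.h>

/* Hypothetical iteration count, much smaller than LOOP in the test above.  */
#define ITERS 1000000

typedef long v2di __attribute__((vector_size(16)));

/* Same function as in the test above.  */
v2di
__attribute__ ((noipa))
foo (v2di a)
{
  a[1] = 111113;
  return a;
}

/* Every call gets the same input b, so no iteration depends on the
   result of an earlier one.  Returns elapsed TSC ticks.  */
unsigned long long
ticks_independent (v2di b, v2di *out)
{
  unsigned int aux;
  v2di a = b;
  unsigned long long start = __rdtscp (&aux);
  for (int i = 0; i < ITERS; i++)
    a = foo (b);
  unsigned long long end = __rdtscp (&aux);
  *out = a;
  return end - start;
}

/* Each call consumes the previous result, forming one long
   dependency chain through the vector value.  Returns elapsed TSC ticks.  */
unsigned long long
ticks_chained (v2di b, v2di *out)
{
  unsigned int aux;
  unsigned long long start = __rdtscp (&aux);
  for (int i = 0; i < ITERS; i++)
    b = foo (b);
  unsigned long long end = __rdtscp (&aux);
  *out = b;
  return end - start;
}
```

Calling both helpers from `main` with the same `b` and printing the two tick counts side by side would show whether the instruction sequence under discussion costs more when it sits on a dependency chain than when the calls can overlap.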