From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: by sourceware.org (Postfix, from userid 48) id C63413858D35; Fri, 10 Dec 2021 09:03:15 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org C63413858D35
From: "husseydevin at gmail dot com"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug rtl-optimization/103641] New: [aarch64][11 regression] Severe compile time regression in SLP vectorize step
Date: Fri, 10 Dec 2021 09:03:15 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: new
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: rtl-optimization
X-Bugzilla-Version: 11.2.0
X-Bugzilla-Keywords:
X-Bugzilla-Severity: normal
X-Bugzilla-Who: husseydevin at gmail dot com
X-Bugzilla-Status: UNCONFIRMED
X-Bugzilla-Resolution:
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags:
X-Bugzilla-Changed-Fields: bug_id short_desc product version bug_status bug_severity priority component assigned_to reporter target_milestone attachments.created
Message-ID:
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-BeenThere: gcc-bugs@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-bugs mailing list
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Fri, 10 Dec 2021 09:03:15 -0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103641

            Bug ID: 103641
           Summary: [aarch64][11 regression] Severe compile time regression
                    in SLP vectorize step
           Product: gcc
           Version: 11.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: rtl-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: husseydevin at gmail dot com
  Target Milestone: ---

Created attachment 51966
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51966&action=edit
aarch64-linux-gnu-gcc-11 -O3 -c xxhash.c -ftime-report
-ftime-report-details

While GCC 11.2 has been noticeably better at NEON64 code, on some files it
stalls for 15-30 seconds or more in the SLP vectorization step.

I haven't narrowed this down to a specific cause yet because I don't know
much about the GCC internals, but it is *extremely* noticeable in the xxHash
library (https://github.com/Cyan4973/xxHash).

This is a test compiling xxhash.c from Git revision
a17161efb1d2de151857277628678b0e0b486155.

This was done on a Core i5-430m with 8GB RAM and an SSD on Debian Bullseye
amd64. GCC 10 (10.2.1-6) was from the repos; GCC 11 (11.2.0) was built from
the tarball with similar flags. While this may introduce bias, the two
compilers get very similar times when the SLP vectorizer is off.

$ time aarch64-linux-gnu-gcc-10 -O3 -c xxhash.c

real    0m3.596s
user    0m3.270s
sys     0m0.149s

$ time aarch64-linux-gnu-gcc-11 -O3 -c xxhash.c

real    0m31.579s
user    0m31.314s
sys     0m0.112s

When disabling the NEON intrinsics with `-DXXH_VECTOR=0`, it only takes ~21
seconds.

Time variable                            usr            sys           wall            GGC
 phase opt and generate        :  31.46 ( 97%)   0.24 ( 32%)  31.80 ( 96%)    54M ( 63%)
 callgraph functions expansion :  31.01 ( 96%)   0.18 ( 24%)  31.29 ( 94%)    42M ( 49%)
 tree slp vectorization        :  28.35 ( 88%)   0.03 (  4%)  28.37 ( 85%)  9941k ( 11%)
 TOTAL                         :  32.34          0.75         33.20           86M

This is significantly worse on my Pi 4B, where an ARMv7->AArch64 build took 3
minutes, although I presume that is mostly due to being 32-bit and the CPU
being much slower.
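
To put the numbers above in proportion, here is a small sketch of the arithmetic in Python. It uses only figures quoted in this report; the variable names are made up for illustration. Note that the `time` runs and the -ftime-report run are separate invocations, so the comparison is approximate.

```python
# Figures copied from this report; all values are in seconds.
gcc10_real = 3.596    # aarch64-linux-gnu-gcc-10 -O3, "real" from time
gcc11_real = 31.579   # aarch64-linux-gnu-gcc-11 -O3, "real" from time
slp_usr = 28.35       # "tree slp vectorization" row, usr column
total_usr = 32.34     # TOTAL row, usr column

# Wall-clock slowdown of GCC 11 relative to GCC 10.
slowdown = gcc11_real / gcc10_real

# Fraction of GCC 11's user time attributed to the SLP pass,
# and the user time left over for everything else.
slp_share = slp_usr / total_usr
non_slp_usr = total_usr - slp_usr

print(f"GCC 11 vs GCC 10 wall-clock slowdown: {slowdown:.1f}x")
print(f"SLP share of user time: {slp_share:.0%}")
print(f"user time outside SLP: {non_slp_usr:.2f}s")
```

This works out to roughly an 8.8x slowdown, with about 88% of user time in the SLP pass. The remaining ~3.99s is in the same range as GCC 10's 3.270s user time, which fits the observation that the two compilers behave similarly when the SLP vectorizer is off.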