public inbox for gcc-bugs@sourceware.org
* [Bug rtl-optimization/103641] New: [aarch64][11 regression] Severe compile time regression in SLP vectorize step
@ 2021-12-10  9:03 husseydevin at gmail dot com
  2021-12-10  9:25 ` [Bug rtl-optimization/103641] " marxin at gcc dot gnu.org
                   ` (36 more replies)
  0 siblings, 37 replies; 38+ messages in thread
From: husseydevin at gmail dot com @ 2021-12-10  9:03 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103641

            Bug ID: 103641
           Summary: [aarch64][11 regression] Severe compile time
                    regression in SLP vectorize step
           Product: gcc
           Version: 11.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: rtl-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: husseydevin at gmail dot com
  Target Milestone: ---

Created attachment 51966
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51966&action=edit
aarch64-linux-gnu-gcc-11 -O3 -c xxhash.c -ftime-report -ftime-report-details

While GCC 11.2 has been noticeably better at NEON64 code, with some files it
hangs for 15-30 seconds or more on the SLP vectorization step.

I haven't narrowed this down to a specific cause yet because I don't know much
about the GCC internals, but it is *extremely* noticeable in the xxHash
library (https://github.com/Cyan4973/xxHash).

This is a test compiling xxhash.c from Git revision
a17161efb1d2de151857277628678b0e0b486155.

This was done on a Core i5-430m with 8GB RAM and an SSD on Debian Bullseye
amd64. GCC 10 (10.2.1-6) was from the repos, GCC 11 (11.2.0) was built from the
tarball with similar flags. While this may cause bias, the two compilers get
very similar times when the SLP vectorizer is off.

$ time aarch64-linux-gnu-gcc-10 -O3 -c xxhash.c

real    0m3.596s
user    0m3.270s
sys     0m0.149s
$ time aarch64-linux-gnu-gcc-11 -O3 -c xxhash.c

real    0m31.579s
user    0m31.314s
sys     0m0.112s

When disabling the NEON intrinsics with `-DXXH_VECTOR=0`, it only takes ~21
seconds. 
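
A rough way to repeat the comparison with the SLP vectorizer switched off is
sketched below. `compare_slp` is a hypothetical helper name, not part of GCC or
xxHash; `-fno-tree-slp-vectorize` is the GCC option that disables the pass.
Timing uses whole seconds from `date +%s`, which is coarse but should be enough
to show a 3s versus 31s gap.

```shell
# compare_slp CC FILE
# Hypothetical helper: compile FILE at -O3 twice with compiler CC, once with
# default flags and once with the SLP vectorizer disabled, and print the
# elapsed wall time of each run in whole seconds.
compare_slp() {
    cc=$1
    src=$2
    for extra in "" "-fno-tree-slp-vectorize"; do
        start=$(date +%s)
        # $extra is deliberately unquoted so the empty case adds no argument.
        "$cc" -O3 $extra -c "$src" -o /dev/null
        end=$(date +%s)
        printf '%s: %ds\n' "${extra:-default}" "$((end - start))"
    done
}

# Example (assumes the cross compilers from this report are on PATH):
#   compare_slp aarch64-linux-gnu-gcc-10 xxhash.c
#   compare_slp aarch64-linux-gnu-gcc-11 xxhash.c
```

If the slowdown really is confined to SLP vectorization, the second line
printed for each compiler should be roughly the same.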

Time variable                       usr           sys          wall           GGC
 phase opt and generate        :  31.46 ( 97%)   0.24 ( 32%)  31.80 ( 96%)    54M ( 63%)
 callgraph functions expansion :  31.01 ( 96%)   0.18 ( 24%)  31.29 ( 94%)    42M ( 49%)
 tree slp vectorization        :  28.35 ( 88%)   0.03 (  4%)  28.37 ( 85%)  9941k ( 11%)

 TOTAL                         :  32.34          0.75         33.20           86M
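
For a full report, which is much longer than the excerpt above, a small filter
can rank the passes by wall time. This is only a sketch: `rank_passes` is a
hypothetical helper, and it assumes the GCC 11 column order (usr, sys, wall,
GGC) and that each row sits on a single line, as it does in the compiler's own
output (the time report is normally written to stderr).

```shell
# rank_passes FILE
# Hypothetical helper: list the passes in a saved -ftime-report log, sorted by
# wall time, largest first. Assumes columns are usr, sys, wall, GGC.
rank_passes() {
    awk -F':' '
        # Pass rows have the form "name : usr (p%) sys (p%) wall (p%) ggc (p%)".
        # The TOTAL row has no percentages and is skipped.
        NF >= 2 && $2 ~ /%\)/ {
            name = $1
            gsub(/^[ \t]+|[ \t]+$/, "", name)   # trim the pass name
            rest = $2
            gsub(/\([^)]*\)/, "", rest)         # drop the percentage groups
            n = split(rest, col, " ")           # col[1..4] = usr sys wall ggc
            if (n >= 3) printf "%8.2f  %s\n", col[3], name
        }' "$1" | sort -rn
}

# Example (assumes the cross compiler from this report is on PATH):
#   aarch64-linux-gnu-gcc-11 -O3 -c xxhash.c -ftime-report 2> report.txt
#   rank_passes report.txt | head
```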

This is significantly worse on my Pi 4B, where an ARMv7->AArch64 cross build
took 3 minutes, although I presume that is mostly due to the host being 32-bit
and the CPU being much slower.


Thread overview: 38+ messages
2021-12-10  9:03 [Bug rtl-optimization/103641] New: [aarch64][11 regression] Severe compile time regression in SLP vectorize step husseydevin at gmail dot com
2021-12-10  9:25 ` [Bug rtl-optimization/103641] " marxin at gcc dot gnu.org
2021-12-10  9:37 ` [Bug tree-optimization/103641] " pinskia at gcc dot gnu.org
2021-12-10  9:43 ` [Bug tree-optimization/103641] [11/12 " pinskia at gcc dot gnu.org
2021-12-10  9:46 ` pinskia at gcc dot gnu.org
2021-12-10  9:56 ` pinskia at gcc dot gnu.org
2021-12-10 10:01 ` marxin at gcc dot gnu.org
2021-12-10 10:02 ` pinskia at gcc dot gnu.org
2021-12-10 10:03 ` pinskia at gcc dot gnu.org
2021-12-10 10:06 ` marxin at gcc dot gnu.org
2021-12-10 10:08 ` pinskia at gcc dot gnu.org
2021-12-10 10:09 ` pinskia at gcc dot gnu.org
2021-12-10 10:12 ` marxin at gcc dot gnu.org
2021-12-10 10:12 ` pinskia at gcc dot gnu.org
2021-12-10 10:14 ` pinskia at gcc dot gnu.org
2021-12-10 10:15 ` [Bug middle-end/103641] " pinskia at gcc dot gnu.org
2021-12-10 10:24 ` pinskia at gcc dot gnu.org
2021-12-10 10:28 ` pinskia at gcc dot gnu.org
2021-12-10 13:17 ` roger at nextmovesoftware dot com
2021-12-10 13:19 ` husseydevin at gmail dot com
2022-01-18 14:10 ` rguenth at gcc dot gnu.org
2022-01-22 14:30 ` roger at nextmovesoftware dot com
2022-01-24  8:13 ` rguenther at suse dot de
2022-01-24 16:49 ` roger at nextmovesoftware dot com
2022-01-24 17:02 ` roger at nextmovesoftware dot com
2022-01-25  7:23 ` rguenth at gcc dot gnu.org
2022-01-25  7:52 ` rguenth at gcc dot gnu.org
2022-02-04  7:26 ` rguenth at gcc dot gnu.org
2022-02-04 10:30 ` cvs-commit at gcc dot gnu.org
2022-02-04 10:43 ` rguenth at gcc dot gnu.org
2022-02-04 11:08 ` tnfchris at gcc dot gnu.org
2022-02-07 12:19 ` tnfchris at gcc dot gnu.org
2022-02-07 15:05 ` [Bug middle-end/103641] [11 " rguenth at gcc dot gnu.org
2022-02-08  8:08 ` tnfchris at gcc dot gnu.org
2022-02-08  8:13 ` pinskia at gcc dot gnu.org
2022-02-08  8:15 ` tnfchris at gcc dot gnu.org
2022-03-16  8:22 ` cvs-commit at gcc dot gnu.org
2022-03-16  8:23 ` rguenth at gcc dot gnu.org
