public inbox for gcc-bugs@sourceware.org
* [Bug middle-end/51017] New: GCC 4.6 performance regression (vs. 4.4/4.5)
@ 2011-11-08  0:43 solar-gcc at openwall dot com
From: solar-gcc at openwall dot com @ 2011-11-08  0:43 UTC
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51017

             Bug #: 51017
           Summary: GCC 4.6 performance regression (vs. 4.4/4.5)
    Classification: Unclassified
           Product: gcc
           Version: 4.6.2
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: middle-end
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: solar-gcc@openwall.com


GCC 4.6 produces approx. 25% slower code on at least x86_64 than 4.4 and 4.5
did for John the Ripper 1.7.8's bitslice DES implementation.  To reproduce,
download
http://download.openwall.net/pub/projects/john/1.7.8/john-1.7.8.tar.bz2 and
build it in the src directory with "make linux-x86-64" (will use SSE2
intrinsics), "make linux-x86-64-avx" (will use AVX instead), or "make generic"
(won't use any intrinsics).  Then run "../run/john -te=1".  With GCC 4.4 and
4.5, the "Traditional DES" benchmark reports a speed of around 2500K c/s for
the "linux-x86-64" (SSE2) build on a 2.33 GHz Core 2 (this is using one core).
With 4.6, this drops to about 1850K c/s.  A similar slowdown was observed for
AVX on a Core i7-2600K when going from GCC 4.5.x to 4.6.x.  It is also
reproducible with the without-intrinsics code, although that's of less
practical importance (the intrinsics are so much faster).  A similar slowdown
with GCC 4.6 was reported by a Mac OS X user.  It was also spotted by Phoronix
in their recently published C compiler benchmarks, but misinterpreted as a GCC
vs. clang difference.

Adding "-Os" to OPT_INLINE in the Makefile partially corrects the performance
(to something like 2000K c/s - still 20% slower than with GCC 4.4/4.5).
Applying the OpenMP patch from
http://download.openwall.net/pub/projects/john/1.7.8/john-1.7.8-omp-des-4.diff.gz
and then running with OMP_NUM_THREADS=1 (for a fair comparison) corrects the
performance almost fully.  Keeping the patch applied but removing -fopenmp
still keeps the performance at a good level.  So it is some change that this
patch makes to the source code, rather than OpenMP itself, that mitigates the
GCC regression.  Similar behavior is seen with the current CVS version of John
the Ripper, even though it has OpenMP support for DES heavily revised and
integrated into the tree.
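
For reference, the percentages quoted above can be checked with a few lines of
Python.  This is only a sketch: the c/s figures are the approximate ones given
in this report, and the function and variable names are made up for
illustration.  It also prints how much longer each candidate takes at the
lower rate, which is a different (larger) number than the throughput drop.

```python
# Approximate "Traditional DES" speeds from this report, in K c/s
# (linux-x86-64 SSE2 build, one core of a 2.33 GHz Core 2).
BASELINE_KCPS = 2500   # GCC 4.4 / 4.5
GCC46_KCPS = 1850      # GCC 4.6
GCC46_OS_KCPS = 2000   # GCC 4.6 with -Os added to OPT_INLINE


def throughput_drop_percent(baseline, measured):
    """How far the measured c/s rate falls below the baseline, in percent."""
    return (baseline - measured) / baseline * 100.0


def extra_time_percent(baseline, measured):
    """How much longer each candidate takes at the measured rate, in percent."""
    return (baseline / measured - 1.0) * 100.0


for label, kcps in (("GCC 4.6", GCC46_KCPS), ("GCC 4.6 with -Os", GCC46_OS_KCPS)):
    drop = throughput_drop_percent(BASELINE_KCPS, kcps)
    extra = extra_time_percent(BASELINE_KCPS, kcps)
    print(f"{label}: {kcps}K c/s, throughput down {drop:.0f}%, "
          f"time per candidate up {extra:.0f}%")
```

The "approx. 25%" and "20%" figures in this report correspond to the
throughput drop, not to the increase in time per candidate.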



Thread overview: 31+ messages
2011-11-08  0:43 [Bug middle-end/51017] New: GCC 4.6 performance regression (vs. 4.4/4.5) solar-gcc at openwall dot com
2011-11-08  0:57 ` [Bug middle-end/51017] " solar-gcc at openwall dot com
2011-11-08  1:05 ` solar-gcc at openwall dot com
2011-12-15  0:34 ` pinskia at gcc dot gnu.org
2012-01-03  4:46 ` solar-gcc at openwall dot com
2012-01-04 19:39 ` solar-gcc at openwall dot com
2012-01-04 22:43 ` jakub at gcc dot gnu.org
2012-01-04 23:00 ` solar-gcc at openwall dot com
2015-02-09  0:12 ` pinskia at gcc dot gnu.org
2015-02-16  0:08 ` solar-gcc at openwall dot com
2015-02-16  1:10 ` solar-gcc at openwall dot com
2015-02-16 10:51 ` [Bug tree-optimization/51017] GCC 4.6 performance regression (vs. 4.4/4.5), PRE increases register pressure rguenth at gcc dot gnu.org
2015-02-17  2:21 ` solar-gcc at openwall dot com
2015-02-17  2:56 ` solar-gcc at openwall dot com
2015-02-17  3:11 ` solar-gcc at openwall dot com
2015-02-17  9:25 ` rguenth at gcc dot gnu.org
2015-02-17  9:27 ` rguenth at gcc dot gnu.org
2015-02-18  0:03 ` solar-gcc at openwall dot com
2015-02-18  1:25 ` solar-gcc at openwall dot com
2015-02-18  3:20 ` solar-gcc at openwall dot com
2015-02-18 10:32 ` [Bug tree-optimization/51017] [4.8/4.9/5 Regression] GCC performance regression (vs. 4.4/4.5), PRE increases register pressure too much rguenth at gcc dot gnu.org
2015-02-18 11:09 ` rguenth at gcc dot gnu.org
2015-02-25 14:26 ` law at redhat dot com
2015-06-23  8:14 ` [Bug tree-optimization/51017] [4.8/4.9/5/6 " rguenth at gcc dot gnu.org
2015-06-26 20:04 ` [Bug tree-optimization/51017] [4.9/5/6 " jakub at gcc dot gnu.org
2015-06-26 20:33 ` jakub at gcc dot gnu.org
2021-05-14  9:46 ` [Bug tree-optimization/51017] [9/10/11/12 Regression] GCC performance regression (vs. 4.4/4.5), PRE/LIM increase " jakub at gcc dot gnu.org
2021-06-01  8:05 ` rguenth at gcc dot gnu.org
2022-05-27  9:34 ` [Bug tree-optimization/51017] [10/11/12/13 " rguenth at gcc dot gnu.org
2022-06-28 10:30 ` jakub at gcc dot gnu.org
2023-07-07 10:29 ` [Bug tree-optimization/51017] [11/12/13/14 " rguenth at gcc dot gnu.org
