From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 18883 invoked by alias); 13 Jul 2012 20:00:16 -0000
Received: (qmail 18864 invoked by uid 22791); 13 Jul 2012 20:00:13 -0000
X-SWARE-Spam-Status: No, hits=-3.6 required=5.0 tests=ALL_TRUSTED,AWL,BAYES_00
X-Spam-Check-By: sourceware.org
Received: from localhost (HELO gcc.gnu.org) (127.0.0.1) by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Fri, 13 Jul 2012 20:00:00 +0000
From: "burnus at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug middle-end/53957] New: Polyhedron 11 benchmark: MP_PROP_DESIGN twice as long as other compiler
Date: Fri, 13 Jul 2012 20:00:00 -0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: new
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: middle-end
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: normal
X-Bugzilla-Who: burnus at gcc dot gnu.org
X-Bugzilla-Status: UNCONFIRMED
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Changed-Fields:
Message-ID:
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
Content-Type: text/plain; charset="UTF-8"
MIME-Version: 1.0
Mailing-List: contact gcc-bugs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id:
List-Archive:
List-Post:
List-Help:
Sender: gcc-bugs-owner@gcc.gnu.org
X-SW-Source: 2012-07/txt/msg01091.txt.bz2

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53957

             Bug #: 53957
           Summary: Polyhedron 11 benchmark: MP_PROP_DESIGN twice as long as
                    other compiler
    Classification: Unclassified
           Product: gcc
           Version: 4.8.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: middle-end
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: burnus@gcc.gnu.org

[Note that MP_PROP_DESIGN is also discussed at the gcc-graphite mailing list,
albeit more with regards to automatic parallelization.]

The Polyhedron benchmark (2011 version) is available at:
http://www.polyhedron.com/polyhedron_benchmark_suite0html, namely:
http://www.polyhedron.com/web_images/documents/pb11.zip

(The original program, which also contains a ready-to-go benchmark, is at
http://propdesign.weebly.com/; note that you may have to rename some input
*.txt files to *TXT.)

The program takes twice as long with GCC as with ifort.

The program is just 502 lines long (w/o comments) and contains no subroutines
or functions. It mainly consists of loops and some math functions (sin, cos,
pow, tan, atan, acos, exp).

[Result on CentOS 5.7, x86-64-gnu-linux, Intel Xeon X3430 @2.40GHz]

Using GCC 4.8.0 20120622 (experimental) [trunk revision 188871], I get:

$ gfortran -Ofast -funroll-loops -fwhole-program -march=native mp_prop_design.f90
$ time ./a.out > /dev/null
real    2m47.138s
user    2m46.808s
sys     0m0.236s

Using Intel's ifort on Intel(R) 64, Version 12.1 Build 20120212:

$ ifort -fast mp_prop_design.f90
$ time ./a.out > /dev/null
real    1m25.906s
user    1m25.598s
sys     0m0.244s

With Intel's libimf preloaded (LD_PRELOAD=.../libimf.so), GCC has:

real    2m0.524s
user    1m59.809s
sys     0m0.689s

The code features expressions like a**2.0D0, but those are converted in GCC to
a*a.

Using -mveclibabi=svml (and no preloading) gives the same timings as without
(or slightly worse); it just calls vmldAtan2.

Vectorizer: I haven't profiled this part, but I want to note that ifort
vectorizes more loops than GCC does.

GCC vectorizes:

662: LOOP VECTORIZED.
1032: LOOP VECTORIZED.
1060: LOOP VECTORIZED.

While ifort has:

mp_prop_design.f90(271): (col. 10) remark: LOOP WAS VECTORIZED.
  (Loop "m1 =2, 45" with conditional jump out of the loop)
mp_prop_design.f90(552): (col. 16) remark: LOOP WAS VECTORIZED.
  (Loop with condition)
mp_prop_design.f90(576): (col. 16) remark: PARTIAL LOOP WAS VECTORIZED.
  (Loop with two IF blocks)
mp_prop_design.f90(639): (col. 16) remark: LOOP WAS VECTORIZED.
  (Rather simple loop)
mp_prop_design.f90(662): (col. 2) remark: LOOP WAS VECTORIZED.
  (Vectorized by GCC)
mp_prop_design.f90(677): (col. 16) remark: PARTIAL LOOP WAS VECTORIZED.
  (Line number points to the outermost of the three loops; there are also
  conditional jumps)
mp_prop_design.f90(818): (col. 16) remark: LOOP WAS VECTORIZED.
  (Nested "if" blocks)
mp_prop_design.f90(1032): (col. 2) remark: LOOP WAS VECTORIZED.
mp_prop_design.f90(1060): (col. 2) remark: LOOP WAS VECTORIZED.
  (The last two are handled by GCC)
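
For anyone who wants to experiment without the full benchmark, here is a minimal sketch of the "loop with condition" shape that the ifort remarks above single out. It is a hypothetical reduced case, not an excerpt from mp_prop_design.f90: the program name, array names, and size are made up for illustration, and I have not checked whether it reproduces the difference between the two compilers. It also uses the a**2.0D0 form mentioned above.

```fortran
! Hypothetical reduced test case; NOT taken from mp_prop_design.f90.
! All names (cond_loop, a, b, n) are assumptions made for illustration.
program cond_loop
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 1024
  real(dp) :: a(n), b(n)
  integer :: i

  ! Fill the input so that both branches of the condition are taken.
  do i = 1, n
     a(i) = real(i - n/2, dp)
  end do

  ! Loop with a condition in its body. Vectorizing this shape requires
  ! the compiler to turn the branch into a per-element select.
  do i = 1, n
     if (a(i) > 0.0_dp) then
        b(i) = a(i)**2.0D0   ! same form as in the report (becomes a*a in GCC)
     else
        b(i) = 0.0_dp
     end if
  end do

  ! Use the result so the loop cannot be optimized away.
  print *, sum(b)
end program cond_loop
```

Compiling it with the options used above plus the vectorizer report (-ftree-vectorizer-verbose=1 on GCC of this vintage, -fopt-info-vec on later releases) should show whether the second loop is vectorized. Nested "if" blocks or a conditional jump out of the loop, as in the other ifort remarks, would need separate reduced cases.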