From mboxrd@z Thu Jan 1 00:00:00 1970
Received: (qmail 25945 invoked by alias); 17 May 2013 03:06:00 -0000
Mailing-List: contact gcc-bugs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
Sender: gcc-bugs-owner@gcc.gnu.org
Received: (qmail 25877 invoked by uid 48); 17 May 2013 03:05:55 -0000
From: "wschmidt at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/57309] New: Spill code degrades vectorized loop for 437.leslie3d on PPC64
Date: Fri, 17 May 2013 03:06:00 -0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: new
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: target
X-Bugzilla-Version: 4.9.0
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: normal
X-Bugzilla-Who: wschmidt at gcc dot gnu.org
X-Bugzilla-Status: UNCONFIRMED
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags:
X-Bugzilla-Changed-Fields: bug_id short_desc product version bug_status keywords bug_severity priority component assigned_to reporter cc cf_gcchost cf_gcctarget cf_gccbuild
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 7bit
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-SW-Source: 2013-05/txt/msg01162.txt.bz2

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57309

            Bug ID: 57309
           Summary: Spill code degrades vectorized loop for 437.leslie3d on
                    PPC64
           Product: gcc
           Version: 4.9.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: wschmidt at gcc dot gnu.org
                CC: bergner at vnet dot ibm.com
              Host: powerpc*-*-*
            Target: powerpc*-*-*
             Build: powerpc*-*-*

Note: This bug does NOT occur on current trunk. To reproduce, it's necessary
to patch config/rs6000/rs6000.h so that MALLOC_ABI_ALIGNMENT is defined as:

    #define MALLOC_ABI_ALIGNMENT (TARGET_64BIT ? 128 : 64)

This allows more vectorization opportunities for loops that access malloc'd
arrays that can be vectorized with 128-bit vectors.

I observed that making this change introduces a degradation of SPEC CPU2006
437.leslie3d, built for 64-bit PowerPC Linux. There are a number of degraded
loops in the code, which seem to all be pretty similar. In all cases the loops
are vectorized with and without the patch, but with the patch there is no need
for prolog code to align the data. Unfortunately, with the patch, the loops
also contain a great deal of spill code (ld, addi, lxvd2x, stxvd2x) which
reloads not only vector registers, but also GPRs used for address computation
of vector loads and stores. Without the spill code, the main loop body would
be vectorized identically with and without the patch.

One of the worst degraded loops is in function fluxk. I have oprofile data
available to identify the loop as well as some dumps showing how the loop is
transformed in various phases, available by request.
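For readers unfamiliar with alignment prologs, the following is a minimal C sketch of the idea, not code from 437.leslie3d (which is Fortran) and not from GCC. The helper names `is_aligned_128`, `peel_count`, and `scale_add` are hypothetical. It assumes 8-byte doubles and 128-bit (16-byte) vectors, and only models how many scalar iterations a prolog would have to peel before the data reaches a 16-byte boundary. When the compiler may assume malloc'd memory is already 128-bit aligned, that count is zero for an array starting at the allocation base, so the prolog can be dropped.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: nonzero if p sits on a 16-byte (128-bit) boundary. */
static int is_aligned_128(const void *p)
{
    return ((uintptr_t)p % 16u) == 0;
}

/*
 * Hypothetical model of an alignment prolog: the number of leading scalar
 * iterations needed before p reaches a 16-byte boundary, capped at n.
 * Assumes p is at least 8-byte aligned, as a double pointer normally is.
 */
static size_t peel_count(const double *p, size_t n)
{
    size_t misalign = (size_t)((uintptr_t)p % 16u);
    size_t peel = misalign ? (16u - misalign) / sizeof(double) : 0;
    return peel < n ? peel : n;
}

/*
 * A simple loop of the kind a vectorizer can process two doubles at a
 * time with 128-bit vectors: y[i] += a * x[i].
 */
static void scale_add(double *restrict y, const double *restrict x,
                      double a, size_t n)
{
    for (size_t i = 0; i < n; i++)
        y[i] += a * x[i];
}
```

Under these assumptions, a pointer that is 8 bytes past a 16-byte boundary needs one peeled iteration, while an already aligned pointer needs none. The sketch says nothing about register pressure; the spill code described in this report is a separate effect observed in the compiled loops.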