From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: (qmail 6673 invoked by alias); 19 Dec 2013 18:09:34 -0000
Mailing-List: contact gcc-bugs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id:
List-Archive:
List-Post:
List-Help:
Sender: gcc-bugs-owner@gcc.gnu.org
Received: (qmail 6628 invoked by uid 48); 19 Dec 2013 18:09:31 -0000
From: "jakub at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/59501] [4.9 Regression] Vector Gather with GCC 4.9 2013-12-08 Snapshot
Date: Thu, 19 Dec 2013 18:09:00 -0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: tree-optimization
X-Bugzilla-Version: 4.9.0
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: normal
X-Bugzilla-Who: jakub at gcc dot gnu.org
X-Bugzilla-Status: NEW
X-Bugzilla-Priority: P1
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: 4.9.0
X-Bugzilla-Flags:
X-Bugzilla-Changed-Fields: priority bug_status cf_reconfirmed_on cc everconfirmed
Message-ID:
In-Reply-To:
References:
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 7bit
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-SW-Source: 2013-12/txt/msg01869.txt.bz2

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59501

Jakub Jelinek changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P1
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2013-12-19
                 CC|                            |hjl at gcc dot gnu.org,
                   |                            |hubicka at gcc dot gnu.org,
                   |                            |jakub at gcc dot gnu.org
     Ever confirmed|0                           |1

--- Comment #1 from Jakub Jelinek ---
This regressed with r203171. Before that change, -maccumulate-outgoing-args was true, but now it isn't.
The change I see in the RTL dumps is that there is a (dead) load from the r10 register into a pseudo from the expand pass to the jump pass; after that the RTL is pretty much the same (different insn numbers) until pro_and_epilogue, which creates all the garbage. The reason why the load from r10 is created, and presumably the reason for the different pro_and_epilogue behavior, is this code in ix86_get_drap_rtx:

  if (ix86_force_drap || !ACCUMULATE_OUTGOING_ARGS)
    crtl->need_drap = true;

But in the function in question, LRA has not spilled anything to the stack, the stack actually isn't used at all, and the drap reg isn't live at the start of the function either (that would be another reason why we'd need to emit some setting of the drap reg, though we probably wouldn't need to dynamically realign the stack).
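
To make the effect of the quoted condition concrete, here is a minimal standalone sketch in C. It is not GCC code: struct fn_state and both functions are hypothetical stand-ins, and only the boolean condition itself is taken from ix86_get_drap_rtx as quoted above. It models how flipping ACCUMULATE_OUTGOING_ARGS from true to false makes need_drap true even for a function that never touches the stack.

```c
#include <stdbool.h>

/* Hypothetical model of the per-function facts discussed above.
   None of these fields exist under these names in GCC.  */
struct fn_state {
    bool force_drap;               /* models ix86_force_drap */
    bool accumulate_outgoing_args; /* models ACCUMULATE_OUTGOING_ARGS */
    bool uses_stack;               /* assumption: anything spilled or stored on the stack */
    bool drap_live_at_entry;       /* assumption: DRAP register live at function start */
};

/* Mirrors the quoted condition that sets crtl->need_drap.  */
bool need_drap(const struct fn_state *s)
{
    return s->force_drap || !s->accumulate_outgoing_args;
}

/* Illustrative only: true when the DRAP is requested although, per the
   observation above, nothing in the function would actually need it.  */
bool drap_is_wasted(const struct fn_state *s)
{
    return need_drap(s) && !s->uses_stack && !s->drap_live_at_entry;
}
```

With accumulate_outgoing_args set to true (the situation before r203171) need_drap() is false for such a function; with it set to false the DRAP is requested and drap_is_wasted() reports the case described in this comment. This only illustrates the observation and is not a proposed fix, since the real decision is taken during expand, before LRA has run.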