From mboxrd@z Thu Jan 1 00:00:00 1970
Received: (qmail 21983 invoked by alias); 24 Nov 2014 12:38:55 -0000
Mailing-List: contact gcc-bugs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
Sender: gcc-bugs-owner@gcc.gnu.org
Received: (qmail 21907 invoked by uid 48); 24 Nov 2014 12:38:51 -0000
From: "jiwang at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/62173] [5.0 regression] [AArch64] Performance regression due to r213488
Date: Mon, 24 Nov 2014 12:38:00 -0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: target
X-Bugzilla-Version: 5.0
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: normal
X-Bugzilla-Who: jiwang at gcc dot gnu.org
X-Bugzilla-Status: ASSIGNED
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: jiwang at gcc dot gnu.org
X-Bugzilla-Target-Milestone: 5.0
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 7bit
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-SW-Source: 2014-11/txt/msg02727.txt.bz2

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62173

--- Comment #9 from Jiong Wang ---
To summarize, given the following testcases:

case A.c
===
void bar(int i)
{
  char A[10];
  g(A);
  f(A[i]);
}

case B.c
===
void bar(int i)
{
  char A[10];
  char B[10];
  char C[10];
  g(A);
  g(B);
  g(C);
  f(A[i]);
  f(B[i]);
  f(C[i]);
  return;
}

The current code base:

  * generates sub-optimal code for case A.
  * generates optimal code for case B, because the frame addresses are
    rematerialized.

I verified that *arm/mips also generate the same sub-optimal code layout for
case A*, and I believe the same holds for Sebastian's testcase.

r213488 put AArch64 on the correct road, and we then ran into a common issue
that also exists on other targets.

For any target where FRAME_GROWS_DOWNWARD is 1, the same sub-optimal code
layout will be generated, because the base address of the first stack variable
is eliminated into frame + some_negative_value in a later stage of LRA, which
means it can't be folded with other constants.

After my experimental hack on LEGITIMIZE_ADDRESS to associate
virtual_stack_vars_rtx with the constant offset, we:

  * generate optimal code for case A.
  * generate sub-optimal code for case B, because the frame addresses are
    *not rematerialized*.

I will investigate this further, especially the frame address
rematerialization after my patch.
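
For anyone who wants to reproduce the comparison, the two testcases can be put into a single file that compiles and links on its own. This is only a sketch: the comment does not show the declarations of f and g, so the signatures `void g(char *)` and `void f(char)` are assumptions, as are the stub bodies, the `last_seen` variable, and the names `bar_a` / `bar_b` (renamed from `bar` so both cases fit in one translation unit).

```c
#include <stddef.h>

/* Hypothetical: last character handed to f(), kept only so the result
   can be observed. Not part of the original testcases. */
int last_seen = -1;

/* Assumed signature. noinline keeps the call out of line so the array
   has to live on the stack and its address has to be formed. */
__attribute__((noinline)) void g(char *p)
{
  for (size_t k = 0; k < 10; k++)
    p[k] = (char)('a' + k);
}

/* Assumed signature. */
__attribute__((noinline)) void f(char c)
{
  last_seen = c;
}

/* Case A: a single stack array, indexed by a runtime value. */
void bar_a(int i)
{
  char A[10];
  g(A);
  f(A[i]);
}

/* Case B: three stack arrays, each indexed by the same runtime value. */
void bar_b(int i)
{
  char A[10];
  char B[10];
  char C[10];
  g(A);
  g(B);
  g(C);
  f(A[i]);
  f(B[i]);
  f(C[i]);
}
```

Compiling this with `-O2 -S` and reading how the address of `A[i]` is formed in `bar_a` and `bar_b` should show the two layouts being compared. To keep f and g fully opaque to the optimizer, they could instead be declared `extern` and defined in a separate file.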