From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: (qmail 17882 invoked by alias); 3 Nov 2014 15:40:29 -0000
Mailing-List: contact gcc-bugs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id:
List-Archive:
List-Post:
List-Help:
Sender: gcc-bugs-owner@gcc.gnu.org
Received: (qmail 17853 invoked by uid 48); 3 Nov 2014 15:40:25 -0000
From: "ramana at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/63724] New: [AArch64] Inefficient immediate expansion and hoisting.
Date: Mon, 03 Nov 2014 15:40:00 -0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: new
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: target
X-Bugzilla-Version: unknown
X-Bugzilla-Keywords:
X-Bugzilla-Severity: normal
X-Bugzilla-Who: ramana at gcc dot gnu.org
X-Bugzilla-Status: UNCONFIRMED
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags:
X-Bugzilla-Changed-Fields: bug_id short_desc product version bug_status bug_severity priority component assigned_to reporter
Message-ID:
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 7bit
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-SW-Source: 2014-11/txt/msg00123.txt.bz2

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63724

            Bug ID: 63724
           Summary: [AArch64] Inefficient immediate expansion and hoisting.
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: ramana at gcc dot gnu.org

For some cases like hmmer in SPEC2k6 we currently generate pretty rubbish code
with AArch64.

float P7Viterbi(int **mmx, int L, int M, int **imx, int **dmx)
{
  int k;

  for (k = 0; k <= M; k++)
    mmx[0][k] = imx[0][k] = dmx[0][k] = -987654321;
}

This ends up generating pretty rubbish code at O2.

        tbnz    w2, #31, .L4
        ldr     x5, [x3]
        ldr     x4, [x4]
        ldr     x6, [x0]
        mov     x0, 0
.L3:
        mov     w1, 38735
        mov     w3, w1
        movk    w1, 0xc521, lsl 16
        str     w1, [x4, x0, lsl 2]
        movk    w3, 0xc521, lsl 16
        mov     w1, 38735
        str     w3, [x5, x0, lsl 2]
        movk    w1, 0xc521, lsl 16
        str     w1, [x6, x0, lsl 2]
        add     x0, x0, 1
        cmp     w2, w0
        bge     .L3
.L4:
        fmov    s0, wzr
        ret
        .size   P7Viterbi, .-P7Viterbi

and could well be

P7Viterbi:
        tbnz    w2, #31, .L4
        ldr     x5, [x3]
        mov     w1, 38735
        ldr     x3, [x4]
        movk    w1, 0xc521, lsl 16
        ldr     x6, [x0]
        mov     x0, 0
.L3:
        str     w1, [x3, x0, lsl 2]
        str     w1, [x5, x0, lsl 2]
        str     w1, [x6, x0, lsl 2]
        add     x0, x0, 1
        cmp     w2, w0
        bge     .L3
.L4:
        fmov    s0, wzr
        ret
        .size   P7Viterbi, .-P7Viterbi

The hoisting is missed because we expand const_int's too early in the AArch64
backend. Given we don't have an "uncse" in the mid-end, it's quite hard to
recover once we've expanded to this form so early in the compiler.

The simple solution is just to move the logic out into a separate splitter
function. We should also investigate what happens if we start doing the same
for our address computations, but that's the subject of a separate patch.

Mine.
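
For reference, a minimal C sketch of how the constant in the test case maps
onto the mov/movk pair seen in both listings, and of the loop shape the second
listing corresponds to. The helper names (imm_lo16, imm_hi16, imm_rebuild,
fill_hoisted) are made up for illustration and are not GCC code; this is not a
workaround either, since the constant is already loop-invariant in the source
and the problem lies in when the backend expands it.

```c
#include <stdint.h>

/* Low 16 bits of the constant: the immediate in "mov w1, 38735". */
static uint16_t imm_lo16(int32_t value)
{
    return (uint16_t)((uint32_t)value & 0xffffu);
}

/* High 16 bits: the immediate in "movk w1, 0xc521, lsl 16". */
static uint16_t imm_hi16(int32_t value)
{
    return (uint16_t)((uint32_t)value >> 16);
}

/* Recombine the halves the way mov followed by movk fills the register.
   The final conversion to int32_t assumes the usual two's complement
   wrap-around, which is what GCC implements. */
static int32_t imm_rebuild(uint16_t lo, uint16_t hi)
{
    return (int32_t)(((uint32_t)hi << 16) | lo);
}

/* Illustrative hand-hoisted form of the loop: the constant is built once
   before the loop and the body only stores it, as in the desired code. */
static void fill_hoisted(int *mmx0, int *imx0, int *dmx0, int M)
{
    const int init = -987654321;
    int k;

    for (k = 0; k <= M; k++)
        mmx0[k] = imx0[k] = dmx0[k] = init;
}
```

Calling imm_lo16 and imm_hi16 on -987654321 gives the two immediates in the
listings, and fill_hoisted(mmx[0], imx[0], dmx[0], M) performs the same stores
as the loop in the test case.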