From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 20584 invoked by alias); 21 Feb 2011 03:15:58 -0000
Received: (qmail 20575 invoked by uid 22791); 21 Feb 2011 03:15:57 -0000
X-SWARE-Spam-Status: No, hits=-2.9 required=5.0 tests=ALL_TRUSTED,AWL,BAYES_00
X-Spam-Check-By: sourceware.org
Received: from localhost (HELO gcc.gnu.org) (127.0.0.1) by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Mon, 21 Feb 2011 03:15:53 +0000
From: "carrot at google dot com"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/47764] The constant load instruction should be hoisted out of loop
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: target
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: enhancement
X-Bugzilla-Who: carrot at google dot com
X-Bugzilla-Status: NEW
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Changed-Fields:
Message-ID:
In-Reply-To:
References:
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
Content-Type: text/plain; charset="UTF-8"
MIME-Version: 1.0
Date: Mon, 21 Feb 2011 04:10:00 -0000
Mailing-List: contact gcc-bugs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id:
List-Archive:
List-Post:
List-Help:
Sender: gcc-bugs-owner@gcc.gnu.org
X-SW-Source: 2011-02/txt/msg02400.txt.bz2

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47764

--- Comment #3 from Carrot 2011-02-21 03:15:45 UTC ---
> Any ideas of how this improvement could be implemented, Carrot?

The root cause of this problem is that ARM/Thumb store instructions can't
store an immediate value directly to memory, but gcc doesn't realize this
early enough. Through most of the RTL phase, the store is kept in the
following form:

(insn 41 38 42 3 (set (mem:HI (plus:SI (reg/f:SI 169)
                (const_int 60 [0x3c])) [2 MEM[(struct deflate_state *)D.2085_3 + 60B]+0 S2 A16])
        (const_int 0 [0])) src/trees.c:45 696 {*thumb2_movhi_insn}
     (expr_list:REG_DEAD (reg/f:SI 169)
        (nil)))

Only at register allocation does gcc discover the restriction of the store
instruction and split the insn into two instructions: load 0 into a register,
then store that register to memory. By then it is too late for loop
optimization to hoist the constant load.

One possible method is to split this insn before loop optimization (maybe
directly in the expand pass), and let the loop and cse optimizations do the
rest. This may increase register pressure in parts of the program, so we
should rematerialize the constant in such cases.
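
To make the situation concrete, here is a minimal source-level sketch of the kind of loop involved. It is not the code from src/trees.c: the struct, field names and function names are hypothetical and chosen only for illustration. The second function writes out by hand what an early split would expose to the loop optimizer, namely that materializing the constant is loop-invariant and only the store has to stay in the loop body.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a table of halfword counters;
   not the real deflate_state layout. */
struct freq_entry {
    uint16_t freq;
    uint16_t code;
};

/* The store of immediate 0 sits inside the loop.  Since the ARM/Thumb
   store cannot take an immediate, each store conceptually needs
   "load 0 into a register" followed by "store register to memory". */
void clear_freqs(struct freq_entry *tab, size_t n)
{
    assert(tab != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        tab[i].freq = 0;
}

/* Source-level picture of the desired result: the constant is
   materialized once before the loop, and only the halfword store
   remains in the body. */
void clear_freqs_hoisted(struct freq_entry *tab, size_t n)
{
    assert(tab != NULL || n == 0);
    const uint16_t zero = 0;      /* conceptually one register load, outside the loop */
    for (size_t i = 0; i < n; i++)
        tab[i].freq = zero;       /* conceptually one register store per iteration */
}
```

Both functions have the same observable behavior. The difference only matters for what the compiler sees at the RTL level, so comparing the generated assembly for the two (for example with -S on a Thumb-2 target) is the way to check whether the constant load actually ends up outside the loop.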