From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: (qmail 7963 invoked by alias); 10 Jan 2006 17:50:06 -0000
Received: (qmail 7923 invoked by uid 48); 10 Jan 2006 17:50:04 -0000
Date: Tue, 10 Jan 2006 17:50:00 -0000
Message-ID: <20060110175004.7922.qmail@sourceware.org>
X-Bugzilla-Reason: CC
References:
Subject: [Bug target/21715] [4.0/4.1 regression] code-generation performance regression
In-Reply-To:
Reply-To: gcc-bugzilla@gcc.gnu.org
To: gcc-bugs@gcc.gnu.org
From: "steven at gcc dot gnu dot org"
Mailing-List: contact gcc-bugs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Archive:
List-Post:
List-Help:
Sender: gcc-bugs-owner@gcc.gnu.org
X-SW-Source: 2006-01/txt/msg00965.txt.bz2
List-Id:

------- Comment #8 from steven at gcc dot gnu dot org  2006-01-10 17:50 -------
The new reassociation pass, or the removal of DOM's reassociation bits, fixed
this on the trunk.

We get poorer initial RTL generation out of GCC 4.1 and we never manage to fix
it up:

The .final_cleanup from GCC 4.1 and GCC 4.0:

;; Function foo (foo)

foo (v)
{
:
  return v & -v;
}

And the .final_cleanup from GCC 4.2:

;; Function foo (foo)

foo (v)
{
:
  return -v & v;
}

 (insn 12 11 13 (parallel [
             (set (reg:DI 60)
-                (and:DI (reg/v:DI 59 [ v ])
-                    (reg:DI 61)))
+                (and:DI (reg:DI 61)
+                    (reg/v:DI 59 [ v ])))
             (clobber (reg:CC 17 flags))
         ]) -1 (nil)
     (nil))

So this regression is not caused by the register allocator, but it does play a
role:

In the .combine and .ce2 RTL dumps, the difference is still there:

 (insn 12 11 16 (parallel [
             (set (reg:DI 60)
-                (and:DI (reg/v:DI 59 [ v ])
-                    (reg:DI 61)))
+                (and:DI (reg:DI 61)
+                    (reg/v:DI 59 [ v ])))
             (clobber (reg:CC 17 flags))
-    (expr_list:REG_DEAD (reg/v:DI 59 [ v ])
-        (expr_list:REG_DEAD (reg:DI 61)
+    (expr_list:REG_DEAD (reg:DI 61)
+        (expr_list:REG_DEAD (reg/v:DI 59 [ v ])
             (expr_list:REG_UNUSED (reg:CC 17 flags)
                 (nil)))))

Then in the .regmove RTL dump something changes:

 (insn:HI 12 11 16 (parallel [
-            (set (reg/v:DI 59 [ v ])
-                (and:DI (reg/v:DI 59 [ v ])
-                    (reg:DI 61)))
+            (set (reg:DI 61)
+                (and:DI (reg:DI 61)
+                    (reg/v:DI 59 [ v ])))
             (clobber (reg:CC 17 flags))
         ]) 297 {*anddi_1_rex64} (insn_list:REG_DEP_TRUE 11 (nil))
-    (expr_list:REG_DEAD (reg:DI 61)
+    (expr_list:REG_DEAD (reg/v:DI 59 [ v ])
         (expr_list:REG_UNUSED (reg:CC 17 flags)
             (nil))))

This small difference eventually leads to a different choice of register
allocation. The choice that GCC 4.2 makes is superior because it makes the
move to the result a dead instruction. The .greg RTL dump shows this:

-(insn:HI 12 11 16 0 (parallel [
-            (set (reg/v:DI 5 di [orig:59 v ] [59])
-                (and:DI (reg/v:DI 5 di [orig:59 v ] [59])
-                    (reg:DI 0 ax [61])))
+(insn:HI 12 11 16 2 (parallel [
+            (set (reg:DI 0 ax [61])
+                (and:DI (reg:DI 0 ax [61])
+                    (reg/v:DI 5 di [orig:59 v ] [59])))
             (clobber (reg:CC 17 flags))
         ]) 297 {*anddi_1_rex64} (insn_list:REG_DEP_TRUE 11 (nil))
     (nil))

-(insn:HI 19 16 25 0 (set (reg/i:DI 0 ax [ ])
-        (reg/v:DI 5 di [orig:59 v ] [59])) 81 {*movdi_1_rex64}
-    (insn_list:REG_DEP_TRUE 12 (nil))
-    (nil))


-- 

steven at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  BugsThisDependsOn|18427                       |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21715
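
For reference, the two .final_cleanup dumps in this comment differ only in
the order of the operands to the AND. The sketch below writes out both forms
in C. It is not the original testcase, which is not quoted here: the
uint64_t type (chosen because the RTL uses DImode), the function names, and
the self_check helper are all assumptions made for illustration. Unsigned
arithmetic is used so that negation is well defined for every input.

```c
#include <assert.h>
#include <stdint.h>

/* Operand order as in the GCC 4.0/4.1 .final_cleanup dump: v & -v.
   The name and the uint64_t type are illustrative assumptions. */
uint64_t lowest_set_bit_a(uint64_t v)
{
    return v & -v;
}

/* Operand order as in the GCC 4.2 .final_cleanup dump: -v & v. */
uint64_t lowest_set_bit_b(uint64_t v)
{
    return -v & v;
}

/* Hypothetical helper: checks that both operand orders agree on a few
   sample values, including zero and the top bit. */
void self_check(void)
{
    const uint64_t samples[] = { 0u, 1u, 12u, 40u, UINT64_C(1) << 63 };
    for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++)
        assert(lowest_set_bit_a(samples[i]) == lowest_set_bit_b(samples[i]));
}
```

Both functions isolate the lowest set bit of v and are semantically
identical, so any difference in the generated code comes from how later
passes treat the operand order, which is what the RTL dumps trace.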