From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 21394 invoked by alias); 4 Sep 2008 19:26:38 -0000
Received: (qmail 19701 invoked by uid 48); 4 Sep 2008 19:25:17 -0000
Date: Thu, 04 Sep 2008 19:26:00 -0000
Message-ID: <20080904192517.19700.qmail@sourceware.org>
X-Bugzilla-Reason: CC
References:
Subject: [Bug middle-end/37364] [4.4 Regression] IRA generates inefficient code
In-Reply-To:
Reply-To: gcc-bugzilla@gcc.gnu.org
To: gcc-bugs@gcc.gnu.org
From: "vmakarov at redhat dot com"
Mailing-List: contact gcc-bugs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id:
List-Archive:
List-Post:
List-Help:
Sender: gcc-bugs-owner@gcc.gnu.org
X-SW-Source: 2008-09/txt/msg00571.txt.bz2

------- Comment #6 from vmakarov at redhat dot com 2008-09-04 19:25 -------
First of all, I've checked the generated code on Core2 and found that it is
not slower than using movd.

IRA assigns hard registers by calculating their costs.  If memory is
cheaper, it assigns memory.  The first decision point between memory and a
hard register is in ira-costs.c.  The second decision point is in
ira-color.c, because the costs can change dynamically (e.g. cheap hard
registers are not available).

In this case IRA decides to use memory instead of an MMX register in
ira-costs.c:

  a0(r61,l0) costs: AREG:25000,25000 DREG:25000,25000 CREG:25000,25000
    BREG:25000,25000 SIREG:25000,25000 DIREG:25000,25000
    AD_REGS:25000,25000 Q_REGS:25000,25000 NON_Q_REGS:25000,25000
    LEGACY_REGS:25000,25000 GENERAL_REGS:25000,25000
    SSE_FIRST_REG:42000,42000 SSE_REGS:42000,42000
    MMX_REGS:25000,25000 MEM:23000

Memory (23000) is cheaper than MMX_REGS (25000), therefore r61 gets NO_REGS
as its cover class, which means using memory.
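
To make the comparison above concrete, here is a minimal sketch in Python.
It is not IRA's code: the function names parse_costs and choose_class are
hypothetical, and taking only the first number of each "class:a,b" pair is
an assumption made for illustration.  It parses the dump line quoted above
and shows that memory wins when its cost is below every class cost.

```python
# Illustrative sketch only; not how ira-costs.c is implemented.
DUMP = (
    "a0(r61,l0) costs: AREG:25000,25000 DREG:25000,25000 CREG:25000,25000 "
    "BREG:25000,25000 SIREG:25000,25000 DIREG:25000,25000 "
    "AD_REGS:25000,25000 Q_REGS:25000,25000 NON_Q_REGS:25000,25000 "
    "LEGACY_REGS:25000,25000 GENERAL_REGS:25000,25000 "
    "SSE_FIRST_REG:42000,42000 SSE_REGS:42000,42000 "
    "MMX_REGS:25000,25000 MEM:23000"
)


def parse_costs(line):
    """Split one cost line into ({class_name: cost}, mem_cost).

    Assumption: only the first number of each pair is used.
    """
    _, _, body = line.partition("costs:")
    class_costs = {}
    mem_cost = None
    for token in body.split():
        name, _, value = token.partition(":")
        first = int(value.split(",")[0])
        if name == "MEM":
            mem_cost = first
        else:
            class_costs[name] = first
    return class_costs, mem_cost


def choose_class(class_costs, mem_cost):
    """Return "NO_REGS" if memory is strictly cheaper than every class."""
    best = min(class_costs, key=class_costs.get)
    if mem_cost < class_costs[best]:
        return "NO_REGS"
    return best


class_costs, mem_cost = parse_costs(DUMP)
print(choose_class(class_costs, mem_cost))
```

With the numbers from the dump, no class costs less than 25000 while MEM
costs 23000, so the sketch selects NO_REGS, matching what IRA reports for
r61.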

The reason for this is in the insn

(insn:HI 14 8 20 2 /home/cygnus/vmakarov/build/ira-merge-branch/gcc/gcc/testsuite/gcc.target/i386/pr34256.c:12
    (set (reg/i:DI 0 ax)
        (subreg:DI (reg:V2SI 61) 0)) 89 {*movdi_1_rex64}
    (expr_list:REG_DEAD (reg:V2SI 61)
        (nil)))

which has the following description:

(define_insn "*movdi_1_rex64"
  [(set (match_operand:DI 0 "nonimmediate_operand"
          "=r,r ,r,m ,!m,*y,*y,?r ,m ,?*Ym,?*y,*x,*x,?r ,m,?*Yi,*x,?*x,?*Ym")
        (match_operand:DI 1 "general_operand"
          "Z ,rem,i,re,n ,C ,*y,*Ym,*y,r ,m ,C ,*x,*Yi,*x,r ,m ,*Ym,*x"))]
  "TARGET_64BIT && !(MEM_P (operands[0]) && MEM_P (operands[1]))"

Please look at the 8th alternative, "?r" / "*Ym", which corresponds to
GENERAL_REGS <- MMX_REGS.  The ? makes the alternative expensive.  The * is
even worse because it excludes the alternative from the register cost
calculation.

The old register allocator would behave the same way had regmove not
coalesced r61 with another pseudo, and had the resulting pseudo not had an
MMX cost cheaper than memory.

There are several ways to fix the problem:

  o ignore * and ? in the cost calculation
  o fix the pattern
  o run regmove as the old RA did
  o make the failure expected

The first solution would result in a huge performance regression for
practically any program.  I cannot comment on the 2nd solution because I am
not the port maintainer.  The third solution is bad because in the general
case IRA does coalescing more smartly on the fly; besides, it could make RA
even slower.  So I'd prefer the last solution.

-- 
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37364
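
As a reading aid for the constraint strings of *movdi_1_rex64 quoted in the
comment, here is a small Python sketch that pairs up the alternatives of
the two operands and flags the ? and * modifiers.  The helper names
alternatives and flags are hypothetical, and the sketch only looks for the
characters; it does not model how GCC actually weighs them.

```python
# Constraint strings copied from the quoted define_insn.
OP0 = "=r,r ,r,m ,!m,*y,*y,?r ,m ,?*Ym,?*y,*x,*x,?r ,m,?*Yi,*x,?*x,?*Ym"
OP1 = "Z ,rem,i,re,n ,C ,*y,*Ym,*y,r ,m ,C ,*x,*Yi,*x,r ,m ,*Ym,*x"


def alternatives(op0, op1):
    """Pair the comma-separated alternatives of two operands, in order."""
    dst = [s.strip() for s in op0.split(",")]
    src = [s.strip() for s in op1.split(",")]
    if len(dst) != len(src):
        raise ValueError("operands must have the same number of alternatives")
    return list(zip(dst, src))


def flags(constraint):
    """Report which of the modifiers discussed in the comment are present."""
    return {"?": "?" in constraint, "*": "*" in constraint}


alts = alternatives(OP0, OP1)
dst, src = alts[7]  # the 8th alternative, counting from 1
print(dst, flags(dst))
print(src, flags(src))
```

For the 8th alternative this prints the pair "?r" and "*Ym": the
destination carries ? and the source carries *, which is the combination
the comment identifies as the cause.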