From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-bugs-return-260765-listarch-gcc-bugs=gcc.gnu.org@gcc.gnu.org>
Received: (qmail 21394 invoked by alias); 4 Sep 2008 19:26:38 -0000
Received: (qmail 19701 invoked by uid 48); 4 Sep 2008 19:25:17 -0000
Date: Thu, 04 Sep 2008 19:26:00 -0000
Message-ID: <20080904192517.19700.qmail@sourceware.org>
X-Bugzilla-Reason: CC
References: <bug-37364-682@http.gcc.gnu.org/bugzilla/>
Subject: [Bug middle-end/37364] [4.4 Regression] IRA generates inefficient code
In-Reply-To: <bug-37364-682@http.gcc.gnu.org/bugzilla/>
Reply-To: gcc-bugzilla@gcc.gnu.org
To: gcc-bugs@gcc.gnu.org
From: "vmakarov at redhat dot com" <gcc-bugzilla@gcc.gnu.org>
Mailing-List: contact gcc-bugs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id: <gcc-bugs.gcc.gnu.org>
List-Archive: <http://gcc.gnu.org/ml/gcc-bugs/>
List-Post: <mailto:gcc-bugs@gcc.gnu.org>
List-Help: <mailto:gcc-bugs-help@gcc.gnu.org>
Sender: gcc-bugs-owner@gcc.gnu.org
X-SW-Source: 2008-09/txt/msg00571.txt.bz2


------- Comment #6 from vmakarov at redhat dot com  2008-09-04 19:25 -------
  First of all, I've checked the generated code on Core2 and found that it is
not slower than the code using movd.

  IRA assigns hard registers by calculating their costs.  If memory is
cheaper, it assigns memory.  The first decision point between memory and a
hard register is in ira-costs.c.  The second decision point is in ira-color.c
because the costs can change dynamically (e.g. cheap hard registers are not
available).

  In this case IRA decides in ira-costs.c to use memory instead of an MMX
register:

  a0(r61,l0) costs: AREG:25000,25000 DREG:25000,25000 CREG:25000,25000
BREG:25000,25000 SIREG:25000,25000 DIREG:25000,25000 AD_REGS:25000,25000
Q_REGS:25000,25000 NON_Q_REGS:25000,25000 LEGACY_REGS:25000,25000
GENERAL_REGS:25000,25000 SSE_FIRST_REG:42000,42000 SSE_REGS:42000,42000
MMX_REGS:25000,25000 MEM:23000

  Memory (23000) is cheaper than MMX_REGS (25000), therefore r61 gets NO_REGS
as its cover class, which means it is placed in memory.
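
  As a rough illustration of the comparison described above (not the actual
code in ira-costs.c), the sketch below parses the cost line from the dump and
picks NO_REGS when memory is cheaper than every register class.  The names
parse_costs and pick_cover_class are hypothetical, and only the first number
of each pair is used; the real algorithm considers much more than this.

```python
import re

# Cost line copied from the IRA dump above.
DUMP = (
    "AREG:25000,25000 DREG:25000,25000 CREG:25000,25000 "
    "BREG:25000,25000 SIREG:25000,25000 DIREG:25000,25000 AD_REGS:25000,25000 "
    "Q_REGS:25000,25000 NON_Q_REGS:25000,25000 LEGACY_REGS:25000,25000 "
    "GENERAL_REGS:25000,25000 SSE_FIRST_REG:42000,42000 SSE_REGS:42000,42000 "
    "MMX_REGS:25000,25000 MEM:23000"
)


def parse_costs(dump):
    """Return ({class_name: cost}, mem_cost) for one dump line.

    Only the first number after each 'NAME:' is read.
    """
    class_costs = {}
    mem_cost = None
    for name, cost in re.findall(r"(\w+):(\d+)", dump):
        if name == "MEM":
            mem_cost = int(cost)
        else:
            class_costs[name] = int(cost)
    return class_costs, mem_cost


def pick_cover_class(class_costs, mem_cost):
    """Hypothetical simplification: the cheapest class wins unless
    memory is strictly cheaper, in which case NO_REGS is returned."""
    best = min(class_costs, key=class_costs.get)
    if mem_cost < class_costs[best]:
        return "NO_REGS"
    return best


if __name__ == "__main__":
    costs, mem = parse_costs(DUMP)
    print(pick_cover_class(costs, mem))
```

  With the numbers from this dump, every register class costs at least 25000
while memory costs 23000, so the sketch selects NO_REGS as well.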

  The reason for this is the following insn:

(insn:HI 14 8 20 2
/home/cygnus/vmakarov/build/ira-merge-branch/gcc/gcc/testsuite/gcc.target/i386/pr34256.c:12
(set (reg/i:DI 0 ax)
        (subreg:DI (reg:V2SI 61) 0)) 89 {*movdi_1_rex64} (expr_list:REG_DEAD
(reg:V2SI 61)
        (nil)))

which has the following description:

(define_insn "*movdi_1_rex64"
  [(set (match_operand:DI 0 "nonimmediate_operand"
          "=r,r  ,r,m ,!m,*y,*y,?r ,m ,?*Ym,?*y,*x,*x,?r ,m,?*Yi,*x,?*x,?*Ym")
        (match_operand:DI 1 "general_operand"
          "Z ,rem,i,re,n ,C ,*y,*Ym,*y,r   ,m  ,C ,*x,*Yi,*x,r  ,m ,*Ym,*x"))]
  "TARGET_64BIT && !(MEM_P (operands[0]) && MEM_P (operands[1]))"

Please look at the 8th alternative, ?r *Ym, which corresponds to GENERAL_REGS
MMX_REGS.  ? makes the alternative expensive.  * is even worse because it
excludes the alternative from the register cost calculation.
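
  To make it easier to see which alternative is meant, here is a small
sketch that splits the two constraint strings from the pattern above into
per-alternative pairs and reports which of the modifiers ?, ! and * each one
carries.  The helper names alternatives and modifiers are hypothetical; this
is only a reading aid, not how GCC itself parses constraints.

```python
# Constraint strings copied from the *movdi_1_rex64 pattern above.
OP0 = "=r,r  ,r,m ,!m,*y,*y,?r ,m ,?*Ym,?*y,*x,*x,?r ,m,?*Yi,*x,?*x,?*Ym"
OP1 = "Z ,rem,i,re,n ,C ,*y,*Ym,*y,r   ,m  ,C ,*x,*Yi,*x,r  ,m ,*Ym,*x"


def alternatives(op0, op1):
    """Pair up the comma-separated alternatives of both operands."""
    dst = [a.strip() for a in op0.split(",")]
    src = [a.strip() for a in op1.split(",")]
    if len(dst) != len(src):
        raise ValueError("operands must have the same number of alternatives")
    return list(zip(dst, src))


def modifiers(constraint):
    """Return the set of '?', '!' and '*' characters in a constraint."""
    return {m for m in "?!*" if m in constraint}


if __name__ == "__main__":
    for number, (dst, src) in enumerate(alternatives(OP0, OP1), start=1):
        flags = "".join(sorted(modifiers(dst) | modifiers(src))) or "-"
        print(f"{number:2d}: {dst:5s} <- {src:4s} modifiers: {flags}")
```

  Counting from one, entry 8 of the list is the pair ?r and *Ym: the
destination carries ? and the source carries *.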

  The old register allocator would behave the same way had regmove not
coalesced r61 with another pseudo, giving a resulting pseudo whose MMX cost is
cheaper than memory.

  There are several ways to fix the problem:

o ignore * and ? in the cost calculation
o fix the pattern
o run regmove as the old RA did
o make the failure expected

The first solution would result in a huge performance regression for
practically any program.

I cannot comment on the second solution because I am not the port maintainer.

The third solution is bad because, in the general case, IRA does coalescing
more smartly on the fly; besides, it could make RA even slower.

So I'd prefer the last solution.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37364