[Bug rtl-optimization/65862] New: [MIPS] IRA/LRA issue: integers spilled to floating-point registers

public inbox for gcc-bugs@sourceware.org

* [Bug rtl-optimization/65862] New: [MIPS] IRA/LRA issue: integers spilled to floating-point registers
@ 2015-04-23 16:04 robert.suchanek at imgtec dot com
  2015-04-23 16:09 ` [Bug rtl-optimization/65862] " pinskia at gcc dot gnu.org
                   ` (11 more replies)
  0 siblings, 12 replies; 13+ messages in thread
From: robert.suchanek at imgtec dot com @ 2015-04-23 16:04 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65862

            Bug ID: 65862
           Summary: [MIPS] IRA/LRA issue: integers spilled to
                    floating-point registers
           Product: gcc
           Version: 5.1.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: rtl-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: robert.suchanek at imgtec dot com
                CC: matthew.fortune at imgtec dot com, vmakarov at redhat dot com,
                    wdijkstr at arm dot com

Following up on this thread:
https://gcc.gnu.org/ml/gcc/2015-04/msg00239.html

Here is a reduced testcase from the Linux kernel:

$ cat sort.c
int a, c;
int *b;
void
fn1(int p1, int *p2(void *, void *), void *p3(void *, void *, int)) {
  int n = c;
  for (;;) {
    a = 1;
    for (; a < n;) {
      p1 && p2(0, (int *) (p1 + 1));
      p3(0, b + p1, 0);
    }
  }
}

The spill/reload to/from an FP reg should be reproducible with (tested on SVN
rev. 222257):
$ mips-img-linux-gnu -mips32r6 -O2 sort.c 

Because ALL_REGS is assigned to most allocnos, LRA uses FP regs freely. The
class is preferred because registers and memory have equal cost. This likely
stems from the following fix:

2011-12-20  Vladimir Makarov  <vmakarov@redhat.com>                  

    PR target/49865
    * ira-costs.c (find_costs_and_classes): Prefer registers even 
      if the memory cost is the same.                            

As Matthew already pointed out, one way to prevent this is through increasing
the cost of moving between GP and FP registers for integral modes.

I briefly tested out Wilco's patch but it did not appear to have the same
effect as changing the cost and I've seen a few ICEs when building the kernel.
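
To illustrate the tie-break described above, here is a minimal, self-contained
sketch. It is not code from ira-costs.c: the enum, the function name
toy_pick_class and the prefer_regs_on_tie flag are hypothetical. The only
assumption is that a pseudo gets a register class unless memory is cheaper,
and that the 2011 change moved the equal-cost case from memory to registers.

```c
#include <stdbool.h>

/* Hypothetical, simplified register classes; not GCC's enum reg_class.  */
enum toy_class { TOY_NO_REGS, TOY_GR_REGS, TOY_FP_REGS, TOY_ALL_REGS };

/* Pick a class for one pseudo.  TOY_NO_REGS stands for "leave it in memory".
   best_cost is the cost of the cheapest register class and mem_cost the cost
   of keeping the pseudo in memory.  prefer_regs_on_tie models the assumed
   behavior after the PR target/49865 change: an exact tie goes to registers.  */
enum toy_class
toy_pick_class (enum toy_class best, int best_cost, int mem_cost,
                bool prefer_regs_on_tie)
{
  if (mem_cost < best_cost)
    return TOY_NO_REGS;
  if (mem_cost == best_cost && !prefer_regs_on_tie)
    return TOY_NO_REGS;
  return best;
}
```

With equal costs and the flag set, the sketch keeps whatever class was
computed, including TOY_ALL_REGS, which is the situation that later allows an
FP register to be picked for an integer pseudo.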


From: pinskia at gcc dot gnu.org @ 2015-04-23 16:09 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65862

--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
The kernel should have been compiled with -msoft-float and I thought it was.


From: robert.suchanek at imgtec dot com @ 2015-04-23 16:54 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65862

--- Comment #2 from Robert Suchanek <robert.suchanek at imgtec dot com> ---
That's correct. It was just easier to expose this problem by compiling the
kernel.


From: vmakarov at gcc dot gnu.org @ 2015-04-27 16:17 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65862

Vladimir Makarov <vmakarov at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |vmakarov at gcc dot gnu.org

--- Comment #3 from Vladimir Makarov <vmakarov at gcc dot gnu.org> ---
  The old patch, which makes ALL_REGS available, changes the order of
coloring and, as a consequence, the allocation result for r218 and r220
after IRA.  That is an unfortunate result of heuristic approaches.

Before the patch:
      ...
      Popping a4(r218,l0)  -- assign reg 30
      Popping a0(r220,l0)  -- spill
      ...
Disposition:
    ... 4:r218 l0    30    6:r219 l0    16    0:r220 l0   mem

After patch:
      Popping a0(r220,l0)  -- assign reg 30
      Popping a4(r218,l0)  -- (memory is more profitable 184 vs 191) spill!
      ...
Disposition:
    ... 4:r218 l0   mem    6:r219 l0    16    0:r220 l0    30

The costs of MEM and FP_REGS are the same; for example, r218 occurs in 2
insns:

59: r218:SI=r194:SI<=0x1
16: pc={(r218:SI!=0)?L18:pc}

The costs are equal if cost of moving general regs to/from fp regs or
memory are equal.  So it looks ok to me.

r218 spilled in IRA is reassigned to a fp reg in *LRA*.  It could be
changed if we used only preferred class in LRA for this.  In this
case, r218 stays spilled and we remove one insn (saving the allocated
fp reg):

        sdc1    $f20,64($sp)

I am not sure, that the result code is better as we access memory 3
times instead of access to $f20.

But I could try to use preferred class in LRA (after checking how it
affects x86/x86-64 performance), if such solution is ok for you.

But I can not just revert the patch making ALL_REGS available to make the
coloring heuristic more fortunate for your particular case, as it
reopens the old PR for which the patch was created and I have no other
solutions for the old PR.

Robert, please let me know what you think.

From: wdijkstr at arm dot com @ 2015-04-27 17:13 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65862

--- Comment #4 from Wilco <wdijkstr at arm dot com> ---
(In reply to Vladimir Makarov from comment #3)

> But I can not just revert the patch making ALL_REGS available to make
> coloring heuristic more fortunate for your particular case, as it
> reopens the old PR for which the patch was created and I have no other
> solutions for the old PR.

I tried reverting the ALL_REGS patch and I don't see any regressions - in fact
allocations are slightly better (fewer registers with ALL_REGS preference which
is what we need - a strong decision to allocate to either FP or int regs). So
what was the motivation for it?

Note that it would be trivial to prefer ALL_REGS for some operands if
necessary. The only case I can imagine is load and store which on some targets
are quite orthogonal. I tried doing m=r#w and m=w#r on AArch64 and that works
fine (this tells the preferencing code that ALL_REGS is best but it still keeps
a clear INT/FP separation in the patterns which you may need for disassembly
etc).

IMHO that is a better solution than to automatically change the patterns r=r+r;
w=w+w into rw=rw+rw and assume that ALL_REGS preference on all operands has
zero cost even though the cost calculation explicitly states otherwise.

From: robert.suchanek at imgtec dot com @ 2015-05-06 16:13 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65862

--- Comment #5 from Robert Suchanek <robert.suchanek at imgtec dot com> ---
Sorry for late reply, I was on vacation.

> The costs are equal if cost of moving general regs to/from fp regs or
> memory are equal.  So it looks ok to me.
> 
> r218 spilled in IRA is reassigned to a fp reg in *LRA*.  

> But I could try to use preferred class in LRA (after checking how it
> affects x86/x86-64 performance), if such solution is ok for you.

Indeed, the above test case only shows the problem in LRA. If the preferred
class would be the winner then why not. However, there are still some issues
with IRA and I have another testcase to show it.

> I am not sure, that the result code is better as we access memory 3
> times instead of access to $f20.

On one hand, yes, it seems good but it's not always desirable to use FP regs
until absolutely necessary. For instance, compiling the dynamic linker that
uses FP regs does not seem to be right.

I had another thought about spilling into registers and how we could guarantee
spilling into the desirable class. In the majority of cases where integers end
up in floating-point registers, I see the following in the dumps:
...
        Reassigning non-reload pseudos
                 Assign 52 to r217 (freq=46)
...

This introduced the use of FP registers (in lra-assigns.c):
...
if (n != 0 && lra_dump_file != NULL)                                
   fprintf (lra_dump_file, "  Reassigning non-reload pseudos\n");    
 qsort (sorted_pseudos, n, sizeof (int), pseudo_compare_func);       
 for (i = 0; i < n; i++)                                             
   {                                                                 
     regno = sorted_pseudos[i];                                      
     hard_regno = find_hard_regno_for (regno, &cost, -1, false);     
     if (hard_regno >= 0)                        
       ...                                                            
     else                                                            
       ...                                                                      
   }      
...

find_hard_regno_for chooses the FP registers freely because the allocno class
is ALL_REGS.

A quick hack in the if conditional, skipping the body for pseudos spilled to
memory:

        ...
        if (hard_regno >= 0 && ! in_mem_p (regno))     
        ...

forces the use of the TARGET_SPILL_CLASS hook and resolves spilling to FP regs
in over 95% of cases, but not entirely. In terms of code size, this change gave
a minor improvement in the average case. Would this approach be the correct way
to guarantee spilling to the desired class?
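
The effect of that guard can be modelled with a small standalone sketch. This
is not lra-assigns.c: struct toy_pseudo and toy_reassign are hypothetical, and
the in_memory field only stands in for the assumed meaning of in_mem_p (the
pseudo was already spilled to a stack slot).

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical record for one spilled pseudo; not an LRA data structure.  */
struct toy_pseudo
{
  int regno;          /* pseudo register number */
  bool in_memory;     /* assumed meaning of in_mem_p: spilled to memory */
  int found_hard_reg; /* result of a find_hard_regno_for-like search, -1 if none */
  int assigned;       /* resulting hard register, -1 if it stays spilled */
};

/* Walk the pseudos and accept the hard register that was found, unless the
   guard is enabled and the pseudo is memory-spilled.  Returns the number of
   pseudos that received a hard register.  */
size_t
toy_reassign (struct toy_pseudo *pseudos, size_t n, bool guard_memory_spills)
{
  size_t assigned = 0;

  for (size_t i = 0; i < n; i++)
    {
      struct toy_pseudo *p = &pseudos[i];
      bool accept = (p->found_hard_reg >= 0
                     && !(guard_memory_spills && p->in_memory));

      p->assigned = accept ? p->found_hard_reg : -1;
      if (accept)
        assigned++;
    }
  return assigned;
}
```

With the guard off, a memory-spilled pseudo such as r217 takes the FP register
that the search found; with the guard on, it stays spilled and the later spill
handling decides where it goes.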

In the remaining 5% of cases, IRA assigns FP regs, with LRA blindly following
IRA's decisions, as in the following reduced case:

int a, b, d, e, j, k, n, o;
unsigned c, h, i, l, m, p;
int *f;
int *g;
int fn1(int p1) { return p1 - a; }

int fn2() {
  b = b + 1 - a;
  e = 1 + o + 1518500249;
  d = d + n;
  c = (int)c + g[0];
  b = b + m + 1;
  d = d + p + 1518500249;
  d = d + k - 1;
  c = fn1(c + j + 1518500249);
  e = fn1(e + i + 1);
  d = d + h + 1859775393 - a;
  c = fn1(c + (d ^ 1 ^ b) + g[1] + 1);
  b = fn1(b + m + 3);
  d = fn1(d + l + 1);
  b = b + (c ^ 1) + p + 1;
  e = fn1(e + (b ^ c ^ d) + n + 1);
  d = o;
  b = 0;
  e = e + k + 1859775393;
  f[0] = e;
}

I'm not sure how this could be fixed in LRA and again this is related to
ALL_REGS for allocnos. Perhaps changing the class for reloads to the spill
class in LRA would do the trick but it may have other problems.
My last attempt was to increase the cost of FP_REGS in IRA for integral modes
(similar effect to increasing the costs of moving FP<>GR in the backend) but
the cost pass looks complicated and I'm not entirely sure where to tweak it.
Any suggestions/ideas?

> I tried reverting the ALL_REGS patch and I don't see any regressions - in
> fact allocations are slightly better (fewer registers with ALL_REGS
> preference which is what we need - a strong decision to allocate to either
> FP or int regs). So what was the motivation for it?

AFAICS, the aim was to fix the code generation regression for x86. x86 doesn't
seem to be as much affected as others. I did not notice code size differences
with -O2 and default arch for x86_64-unknown-linux-gnu triplet and CSiBE
benchmark, -Os showed some minor improvements/regression with the largest
difference in mpeg2dec-0.3.1 yielding ~0.3% improvement. I haven't evaluated
performance changes.

For MIPS, I also saw allocation improvements, more erratic than x86, with an
improvement of about 0.5% on average. Reverting the patch does bring the old
issue back, but I wonder what its impact is and whether the fix is justifiable
to the extent that it outweighs the disadvantages. Or maybe the original
problem could be fixed differently?

From: matthew.fortune at imgtec dot com @ 2015-05-06 16:21 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65862

--- Comment #6 from Matthew Fortune <matthew.fortune at imgtec dot com> ---
(In reply to Robert Suchanek from comment #5)
> > I am not sure, that the result code is better as we access memory 3
> > times instead of access to $f20.
> 
> On one hand, yes, it seems good but it's not always desirable to use FP regs
> until absolutely necessary. For instance, compiling the dynamic linker that
> uses FP regs does not seem to be right.

On the costs front: while it is true that moves between FPR and GPR are
usually cheaper than moving to memory and back, there is a secondary cost,
which is that simply turning on the FPU is costly. So the reason for using
FPRs needs to be that floating-point instructions are used, rather than just
the registers. Ideally we would not spill to FPRs unless actual floating-point
code has been used; this suggests it would be good to set the cost of
GPR->FPR higher than memory if no floating-point code is present in the
function. Otherwise, if the FPU is in use anyway, then using FPRs as
scratch/spill for integer-mode data is fine and the costs can be lower than
memory.

Matthew
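
A minimal sketch of that cost policy, under stated assumptions: the numeric
costs and the names toy_gpr_fpr_move_cost, TOY_MEMORY_COST and
TOY_FPR_MOVE_COST are made up for illustration and are not the MIPS backend's
values or functions.

```c
#include <stdbool.h>

/* Hypothetical cost units, chosen only so that a GPR<->FPR move is normally
   cheaper than a round trip through memory.  */
enum { TOY_MEMORY_COST = 4, TOY_FPR_MOVE_COST = 2 };

/* Cost of moving a value between a GPR and an FPR.  For integer-mode data in
   a function with no floating-point code, the move is priced above memory so
   that an allocator comparing the two would prefer the memory spill.  */
int
toy_gpr_fpr_move_cost (bool is_integer_mode, bool function_uses_fp)
{
  if (is_integer_mode && !function_uses_fp)
    return TOY_MEMORY_COST + 1;
  return TOY_FPR_MOVE_COST;
}
```

The design point is that the penalty applies only when the FPU would otherwise
stay off; once the function uses floating point anyway, FPRs remain a cheap
spill area for integer data.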

From: vmakarov at gcc dot gnu.org @ 2015-05-07 18:55 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65862

--- Comment #7 from Vladimir Makarov <vmakarov at gcc dot gnu.org> ---
(In reply to Robert Suchanek from comment #5)
> Sorry for late reply, I was on vacation.
> 
> > The costs are equal if cost of moving general regs to/from fp regs or
> > memory are equal.  So it looks ok to me.
> > 
> > r218 spilled in IRA is reassigned to a fp reg in *LRA*.  
> 
> > But I could try to use preferred class in LRA (after checking how it
> > affects x86/x86-64 performance), if such solution is ok for you.
> 
> Indeed, the above test case only shows the problem in LRA. If the preferred
> class would be the winner then why not. However, there are still some issues
> with IRA and I have another testcase to show it.
> 
> > I am not sure, that the result code is better as we access memory 3
> > times instead of access to $f20.
> 
> On one hand, yes, it seems good but it's not always desirable to use FP regs
> until absolutely necessary. For instance, compiling the dynamic linker that
> uses FP regs does not seem to be right.
> 
> I had another thought about spilling into registers and how we could
> guarantee
> spilling into the desirable class. In the majority of cases where integers
> end up
> in floating-point registers, I see the following in the dumps:
> ...
> 	Reassigning non-reload pseudos
> 	         Assign 52 to r217 (freq=46)
> ...
> 

If we use the preferred class instead of the allocno class, the problem goes
away, provided, of course, that IRA does not assign ALL_REGS as the preferred
class.

> 
> In the remaining 5% cases, IRA assigns FP regs with LRA blindly following
> IRA's decisions like in the following reduced case:
> 

I guess the same would happen for reload pass too, as it reassigns pseudos
spilled in reload when IRA assigns ALL_REGS to preferred class.

> int a, b, d, e, j, k, n, o;
> unsigned c, h, i, l, m, p;
> int *f;
> int *g;
> int fn1(int p1) { return p1 - a; }
> 
> int fn2() {
>   b = b + 1 - a;
>   e = 1 + o + 1518500249;
>   d = d + n;
>   c = (int)c + g[0];
>   b = b + m + 1;
>   d = d + p + 1518500249;
>   d = d + k - 1;
>   c = fn1(c + j + 1518500249);
>   e = fn1(e + i + 1);
>   d = d + h + 1859775393 - a;
>   c = fn1(c + (d ^ 1 ^ b) + g[1] + 1);
>   b = fn1(b + m + 3);
>   d = fn1(d + l + 1);
>   b = b + (c ^ 1) + p + 1;
>   e = fn1(e + (b ^ c ^ d) + n + 1);
>   d = o;
>   b = 0;
>   e = e + k + 1859775393;
>   f[0] = e;
> }
> 
> I'm not sure how this could be fixed in LRA and again this is related to
> ALL_REGS for allocnos. Perhaps changing the class for reloads to the spill
> class in LRA would do the trick but it may have other problems.
> My last attempt was to increase the cost of FP_REGS in IRA for integral
> modes (similar effect to increasing the costs of moving FP<>GR in the
> backend) but the cost pass looks complicated and I'm not entirely sure where
> to tweak it. Any suggestions/ideas?
> 

Currently, I see the solution in introducing a target hook which can narrow
the allocno class for a given pseudo in IRA.  For a pseudo of non-fp mode, it
should narrow ALL_REGS to general regs in the MIPS case.

Actually, I was already asked for such a hook by ARM people, but their case
was not convincing.


> > I tried reverting the ALL_REGS patch and I don't see any regressions - in
> > fact allocations are slightly better (fewer registers with ALL_REGS
> > preference which is what we need - a strong decision to allocate to either
> > FP or int regs). So what was the motivation for it?
> 
> AFAICS, the aim was to fix the code generation regression for x86. x86
> doesn't seem to be as much affected as others. I did not notice code size
> differences with -O2 and default arch for x86_64-unknown-linux-gnu triplet
> and CSiBE benchmark, -Os showed some minor improvements/regression with the
> largest difference in mpeg2dec-0.3.1 yielding ~0.3% improvement. I haven't
> evaluated performance changes.
> 
> For MIPS, I also saw allocation improvements, more erratic than x86 with
> improvement about 0.5% on average. Reverting the patch does bring the old
> issue back but I wonder what is the impact of it and whether it is a
> justifiable fix to the extent it outweighs the disadvantages. Or maybe the
> original problem could be fixed differently?

I'll try to investigate this more.  But first, I'd like to make a patch for the
new hook so that you can evaluate its usefulness for MIPS.  I hope to make it
and send it to you tomorrow.


From: vmakarov at gcc dot gnu.org @ 2015-05-07 18:58 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65862

--- Comment #8 from Vladimir Makarov <vmakarov at gcc dot gnu.org> ---
(In reply to Matthew Fortune from comment #6)
> (In reply to Robert Suchanek from comment #5)
> > > I am not sure, that the result code is better as we access memory 3
> > > times instead of access to $f20.
> >
> > On one hand, yes, it seems good but it's not always desirable to use FP regs
> > until absolutely necessary. For instance, compiling the dynamic linker that
> > uses FP regs does not seem to be right.
>
> On the costs front then while it is true that moves between FPR and GPR are
> usually cheaper than moving to memory and back there is a secondary cost
> which is that simply turning on the FPU is costly. So, the reason for using
> FPRs needs to be that the floating point instructions are used rather than
> registers. Ideally we would not spill to FPRs unless there has been actual
> floating point code used, this suggests it would be good to set the cost of
> GPR->FPR higher than memory if no floating point code is present in the
> function. Otherwise if the FPU is in use anyway then using FPRs as
> scratch/spill for integer mode data is fine and the costs can be lower than
> memory.

Thanks, Matt.  It is a good point.  As I wrote in my previous comment,
introducing a target hook narrowing the allocno class could help here.  A hook
implementation for MIPS could change ALL_REGS to general regs unless the pseudo
is of a floating point mode.
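
The narrowing rule described above can be sketched in a few lines. This is a
standalone model, not the patch attached to this bug and not the MIPS backend:
the enums and the names toy_float_mode_p and toy_change_pseudo_allocno_class
are hypothetical stand-ins for GCC's register classes, machine modes and hook.

```c
/* Hypothetical stand-ins for GCC's register classes and machine modes.  */
enum toy_class { TOY_NO_REGS, TOY_GR_REGS, TOY_FP_REGS, TOY_ALL_REGS };
enum toy_mode { TOY_MODE_SI, TOY_MODE_DI, TOY_MODE_SF, TOY_MODE_DF };

/* Nonzero if MODE is one of the toy floating-point modes.  */
static int
toy_float_mode_p (enum toy_mode mode)
{
  return mode == TOY_MODE_SF || mode == TOY_MODE_DF;
}

/* Narrow the allocno class of a pseudo: ALL_REGS becomes general regs unless
   the pseudo has a floating-point mode.  Every other class is returned
   unchanged.  */
enum toy_class
toy_change_pseudo_allocno_class (enum toy_mode pseudo_mode,
                                 enum toy_class allocno_class)
{
  if (allocno_class == TOY_ALL_REGS && !toy_float_mode_p (pseudo_mode))
    return TOY_GR_REGS;
  return allocno_class;
}
```

Because only ALL_REGS is rewritten, a pseudo that the cost pass already placed
in a specific class keeps that class, so the rule only removes the freedom
that led to integer values landing in FP registers.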


From: vmakarov at gcc dot gnu.org @ 2015-05-08 18:29 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65862

--- Comment #9 from Vladimir Makarov <vmakarov at gcc dot gnu.org> ---
Created attachment 35503
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=35503&action=edit
ira-hook.patch

Here is the patch.  Could you try it and give me your opinion about it?
Thanks.


From: vmakarov at gcc dot gnu.org @ 2015-05-12 20:38 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65862

--- Comment #11 from Vladimir Makarov <vmakarov at gcc dot gnu.org> ---
(In reply to Robert Suchanek from comment #10)
> Hi Vlad,
> 
> I'm pleased with the results so far. In the larger codebase, it behaves as
> the original
> patch reverted and I haven't seen a missed case. 
> 
> The code size doesn't seem to be hurt either. I see ~0.5% improvement on
> average case which is good. Thanks a lot for the patch. I've thrown it
> against standard regression and it will take some time to complete but I'm
> confident it will pass. Initially it failed because of a missing declaration
> of the new_class variable though.

Thanks, Robert.

If it is ok, I will commit it to the trunk this week.  I am going to commit
it without the MIPS hook, so I hope the MIPS maintainers will add the hook
themselves.


From: robert.suchanek at imgtec dot com @ 2015-05-13  7:53 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65862

--- Comment #12 from Robert Suchanek <robert.suchanek at imgtec dot com> ---
Thanks Vlad.

The regression run is clean on MIPS. I'll send a patch along with the
testcase(s) once the core part is committed to the trunk.


From: wdijkstr at arm dot com @ 2015-05-14 14:20 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65862

--- Comment #13 from Wilco <wdijkstr at arm dot com> ---
(In reply to Vladimir Makarov from comment #9)
> Created attachment 35503 [details]
> ira-hook.patch
> 
> Here is the patch.  Could you try it and give me your opinion about it. 
> Thanks.

I tried it out and when forcing ALL_REGS to either GENERAL_REGS or FP_REGS
based on mode it generates significantly smaller spillcode, especially on some
of the high register pressure SPECFP2006 benchmarks like gamess. It is better
than avoiding ALL_REGS if the cost is higher (like the PPC patch mentioned
earlier) - this indicates that the case for preferring ALL_REGS for generic
loads/stores is pretty thin and likely not beneficial overall.

I'm glad that with this we're moving towards a more conventional allocation
scheme where the decision of which register class to use is made early rather
than independently for each operand during allocation.

I didn't do a full performance regression test but early results show identical
performance with smaller codesize.

Note that this doesn't solve the lra-constraints issue where it ignores the
allocno class during spilling and just chooses the first variant that matches.

From: vmakarov at gcc dot gnu.org @ 2015-05-14 20:41 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65862

--- Comment #14 from Vladimir Makarov <vmakarov at gcc dot gnu.org> ---
Author: vmakarov
Date: Thu May 14 20:40:44 2015
New Revision: 223202

URL: https://gcc.gnu.org/viewcvs?rev=223202&root=gcc&view=rev
Log:
2015-05-14  Vladimir Makarov  <vmakarov@redhat.com>

        PR rtl-optimization/65862
        * target.def (ira_change_pseudo_allocno_class): New hook.
        * targhooks.c (default_ira_change_pseudo_allocno_class): Default
        value of the hook.
        * targhooks.h (default_ira_change_pseudo_allocno_class): New
        extern
        * doc/tm.texi.in (TARGET_IRA_CHANGE_PSEUDO_ALLOCNO_CLASS): Add the
        hook.
        * ira-costs.c (find_costs_and_classes): Call the hook and change
        classes when it is necessary.
        * doc/tm.texi: Update.


Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/doc/tm.texi
    trunk/gcc/doc/tm.texi.in
    trunk/gcc/ira-costs.c
    trunk/gcc/target.def
    trunk/gcc/targhooks.c
    trunk/gcc/targhooks.h
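
The shape of the change in the log above (a hook with a pass-through default,
consulted after the class has been computed) can be sketched as follows. This
is a toy model with hypothetical names and types, not the committed code from
target.def, targhooks.c or ira-costs.c.

```c
#include <stddef.h>

/* Hypothetical, simplified register classes; not GCC's enum reg_class.  */
enum toy_class { TOY_NO_REGS, TOY_GR_REGS, TOY_FP_REGS, TOY_ALL_REGS };

/* Type of the toy hook: given a pseudo and its computed class, return the
   class the allocator should actually use.  */
typedef enum toy_class (*toy_change_class_hook) (int regno, enum toy_class cl);

/* Default: leave the class unchanged, so targets that do not define the hook
   see no difference in behavior.  */
enum toy_class
toy_default_change_class (int regno, enum toy_class cl)
{
  (void) regno;
  return cl;
}

/* Example target override (hypothetical): narrow ALL_REGS to general regs.  */
enum toy_class
toy_narrowing_change_class (int regno, enum toy_class cl)
{
  (void) regno;
  return cl == TOY_ALL_REGS ? TOY_GR_REGS : cl;
}

/* Models the call site: once the cost pass has computed a class, ask the hook
   and use its answer.  A null hook falls back to the default.  */
enum toy_class
toy_final_class (toy_change_class_hook hook, int regno,
                 enum toy_class computed)
{
  if (hook == NULL)
    hook = toy_default_change_class;
  return hook (regno, computed);
}
```

Keeping the default as an identity function is what allows the core part to be
committed on its own, with each target opting in later.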

