public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed
* IRA vs CANNOT_CHANGE_MODE_CLASS, + 4.7 IRA regressions?
@ 2011-07-28  8:11 Sandra Loosemore
  2011-07-28 14:24 ` Richard Guenther
  2011-07-29 18:57 ` Vladimir Makarov
  0 siblings, 2 replies; 6+ messages in thread
From: Sandra Loosemore @ 2011-07-28  8:11 UTC (permalink / raw)
  To: gcc

Consider this bit of code:

extern double a[20];

double test1 (int n)
{
   double accum = 0.0;
   int i;

   for (i=0; i<n; i++) { accum -= a[i]; }
   accum = fabs (accum);
   return accum;
}

which is compiled for MIPS using

mipsisa32r2-sde-elf-gcc -O3 -fno-inline -fno-unroll-loops -march=74kf1_1 
-S abstest.c

With a GCC 4.6 compiler, this produces:
...
.L3:
	mtc1	$3,$f2
	ldc1	$f0,0($5)
	addiu	$5,$5,8
	mtc1	$2,$f3
	sub.d	$f2,$f2,$f0
	mfc1	$3,$f2
	bne	$5,$4,.L3
	mfc1	$2,$f3

	ext	$5,$2,0,31
	move	$4,$3
.L2:
	mtc1	$4,$f0
	j	$31
	mtc1	$5,$f1
...

This is terrible code, with all that pointless register-shuffling inside 
the loop -- what's gone wrong?  Well, the bit-twiddling expansion of 
"fabs" produced by optabs.c uses subreg expressions, and on MIPS 
CANNOT_CHANGE_MODE_CLASS disallows use of FP registers for integer 
operations.  And, when IRA sees that, it decides it cannot alloc "accum" 
to a FP reg at all, even if it obviously makes sense to put it there for 
the rest of its lifetime.

On mainline trunk, things are even worse as it's spilling to memory, not 
just shuffling between registers:

.L3:
	ldc1	$f0,0($2)
	addiu	$2,$2,8
	sub.d	$f2,$f2,$f0
	bne	$2,$3,.L3
	sdc1	$f2,0($sp)

	lw	$2,0($sp)
	ext	$3,$2,0,31
	lw	$2,4($sp)
.L2:
	sw	$2,4($sp)
	sw	$3,0($sp)
	lw	$3,4($sp)
	lw	$2,0($sp)
	addiu	$sp,$sp,8
	mtc1	$3,$f0
	j	$31
	mtc1	$2,$f1

I've been experimenting with a patch to the MIPS backend to add 
define_insn_and_split patterns for floating-point abs -- the idea is to 
attach some constraints to the insns to tell IRA it needs a GP reg for 
these operations, so it can apply its usual cost analysis and reload 
logic instead of giving up.  Then the split to introduce the subreg 
expansion happens after reload when we already have the right register 
class.  This seems to work well enough on 4.6; for this particular 
example, I'm getting:

.L3:
	ldc1	$f2,0($2)
	addiu	$2,$2,8
	bne	$2,$4,.L3
	sub.d	$f0,$f0,$f2

	mfc1	$2,$f1
	ext	$2,$2,0,31
	j	$31
	mtc1	$2,$f1

However, same patch on mainline is still giving spills to memory.  :-(

So, here's my question.  Is it worthwhile for me to continue this 
approach of trying to make the MIPS backend smarter?  Or is the way IRA 
deals with CANNOT_CHANGE_MODE_CLASS fundamentally broken and in need of 
fixing in a target-inspecific way?  And/or is there some other 
regression in IRA on mainline that's causing it to spill to memory when 
it didn't used to in 4.6?

BTW, the unary "neg" operator has the same problem as "abs" on MIPS; 
can't use the hardware instruction because it does the wrong thing with 
NaNs, and can't twiddle the sign bit directly in a FP register.  With 
both abs/neg now generating unnecessary memory spills, this seems like a 
fairly important performance regression....

-Sandra

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2011-08-01 20:03 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-07-28  8:11 IRA vs CANNOT_CHANGE_MODE_CLASS, + 4.7 IRA regressions? Sandra Loosemore
2011-07-28 14:24 ` Richard Guenther
2011-07-28 14:47   ` Sandra Loosemore
2011-07-28 22:47     ` Jeff Law
2011-07-29 18:57 ` Vladimir Makarov
2011-08-01 20:03   ` Sandra Loosemore

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).