public inbox for gcc@gcc.gnu.org
* RE: Address Arithmetic Improvement
@ 2003-04-10 10:13 Rakesh Kumar - Software, Noida
  0 siblings, 0 replies; 3+ messages in thread
From: Rakesh Kumar - Software, Noida @ 2003-04-10 10:13 UTC (permalink / raw)
  To: tm_gccmail; +Cc: gcc, Alexandre Oliva

Hi,

> I was thinking about this problem a few weeks ago, and it may be possible
> to use the peep2 pass to opportunistically combine multiple adds into a
> single add instruction, such as

peep2 works for adjacent instructions only. Defining one peep2 
will not suffice for all cases. For instance, in your example:

	fmov.s	@r0+,fr0
	fmov.s	@r0,fr1
	add	#-4,r0
...						<-- Here
	add	#64,r0
	fmov.s	@r0+,fr2
	fmov.s	@r0,fr3

At the location marked Here, there can be multiple instructions. If we
consider a peephole window of, say, 4 instructions, we eventually need to
define a peep2 for 3 cases, namely:

1.	add #-4, r0
	add #64, r0

2.	add #-4, r0
	one insn not ending the basic block and not using r0 (use match_insn)
	add #64, r0

3.	add #-4, r0
	two insns with same constraints
	add #64, r0

As we increase the peephole window, we need to increase the number of
peepholes as well.

This does not seem to solve the problem, since there may be instructions
that fall outside that window but still lie in the same basic block.
Because the insns are split during the post-reload pass, such cases are
possible.
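
To make the window limitation concrete, here is a minimal sketch in C. It is
not GCC code: pair_fits_window and patterns_needed are hypothetical helper
names, and insns are modelled only by their index within the basic block.

```c
#include <assert.h>
#include <stdbool.h>

/* True when two adds, at insn indices first and second (first < second),
   both fit inside one peephole window of `window` consecutive insns. */
static bool pair_fits_window(int first, int second, int window)
{
    assert(first < second);
    return second - first + 1 <= window;
}

/* Number of separate patterns needed to cover every gap a window allows:
   0, 1, ..., window - 2 intervening insns between the two adds. */
static int patterns_needed(int window)
{
    return window < 2 ? 0 : window - 1;
}
```

With a window of 4, patterns_needed(4) gives the 3 cases listed above, and a
pair of adds separated by 5 other insns (indices 0 and 6) is never seen by
any of them.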

I suppose there are two solutions to this problem:

1. We could develop a new pass that improves the address arithmetic. 
   The disadvantage of this is the increased compile time.

2. A cleaner solution is to improve the implementation of -mfmovd on SH4.
   This flag enables double-sized floating-point moves, but at present it
   is not supported properly. When it is set, those DF splits should not
   happen, so the extra address arithmetic instructions would not be
   generated.

What do you think?

Thanks and Best Regards,
Rakesh Kumar


* Re: Address Arithmetic Improvement
  2003-04-09 14:53 Rakesh Kumar - Software, Noida
@ 2003-04-09 20:24 ` tm_gccmail
  0 siblings, 0 replies; 3+ messages in thread
From: tm_gccmail @ 2003-04-09 20:24 UTC (permalink / raw)
  To: Rakesh Kumar - Software, Noida; +Cc: gcc, Alexandre Oliva, Jan Hubicka

On Wed, 9 Apr 2003, Rakesh Kumar - Software, Noida wrote:

> Processor: SH4
> Mode: Little Endian
>  
> On compiling the attached program with -O2, I observed that reload pass
> splits up the DF mode address arithmetic insn into SF mode insns using FPSCR
> register.
>  
> I believe SH4 has the restriction that in little endian, we can't use
> double-precision floating-point moves between memory and registers.
>  
> But generating two insns, where one would suffice, introduces redundancy.
> Also to adjust the pointers, we are generating additional insns as defined
> in define_split in sh.md.
>  
> I propose developing a function that scans the basic block
> and combines address arithmetic insns. But I'm not clear about where to put
> it so that it does not affect the present compilation process.
>  
> Can anybody suggest an alternative?
>  
> Thanks in advance
> Rakesh Kumar
> 

I think regmove is supposed to do this, but it runs too early to catch the
post-reload DFmode splits.

I was thinking about this problem a few weeks ago, and it may be possible
to use the peep2 pass to opportunistically combine multiple adds into a
single add instruction, such as:

	fmov.s	@r0+,fr0
	fmov.s	@r0,fr1
	add	#-4,r0
...
	add	#64,r0
	fmov.s	@r0+,fr2
	fmov.s	@r0,fr3

into

	fmov.s	@r0+,fr0
	fmov.s	@r0,fr1
...
	add	#60,r0
	fmov.s	@r0+,fr2
	fmov.s	@r0,fr3
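
A minimal sketch of this folding over a whole basic block, written in C
rather than as a peep2 pattern. Everything here is an assumption made for
illustration: struct insn, its touches bitmask and fold_adds are
hypothetical and are not GCC's RTL or any existing pass. The range check
reflects the 8-bit signed immediate of the SH "add #imm,Rn" encoding.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified insn model; not GCC's RTL. */
enum insn_kind { ADD_IMM, OTHER };

struct insn {
    enum insn_kind kind;
    int reg;           /* ADD_IMM only: register being adjusted (0..15) */
    int imm;           /* ADD_IMM only: immediate value */
    unsigned touches;  /* OTHER only: bit i set if insn reads or writes r<i> */
    bool deleted;      /* set when the insn has been folded away */
};

/* SH "add #imm,Rn" takes an 8-bit signed immediate. */
static bool imm_ok(int v)
{
    return v >= -128 && v <= 127;
}

/* Fold an add-immediate into the next add-immediate on the same register,
   provided no insn in between touches that register and the sum still
   fits the immediate field.  Returns the number of insns removed. */
static int fold_adds(struct insn *bb, size_t n)
{
    int removed = 0;

    for (size_t i = 0; i < n; i++) {
        if (bb[i].deleted || bb[i].kind != ADD_IMM)
            continue;
        for (size_t j = i + 1; j < n; j++) {
            if (bb[j].deleted)
                continue;
            if (bb[j].kind == OTHER) {
                if (bb[j].touches & (1u << bb[i].reg))
                    break;          /* register is used here: cannot fold */
                continue;
            }
            if (bb[j].reg != bb[i].reg)
                continue;           /* add on an unrelated register */
            if (imm_ok(bb[i].imm + bb[j].imm)) {
                bb[j].imm += bb[i].imm;
                bb[i].deleted = true;
                removed++;
            }
            break;                  /* next add on this register reached */
        }
    }
    return removed;
}
```

Applied to the sequence above, with the elided insns assumed not to touch
r0, the add #-4 is dropped and the add #64 becomes add #60. The sketch does
not handle a folded sum of zero specially, and it stops at the first use of
the register, so it is conservative.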

Toshi


* Address Arithmetic Improvement
@ 2003-04-09 14:53 Rakesh Kumar - Software, Noida
  2003-04-09 20:24 ` tm_gccmail
  0 siblings, 1 reply; 3+ messages in thread
From: Rakesh Kumar - Software, Noida @ 2003-04-09 14:53 UTC (permalink / raw)
  To: gcc; +Cc: Alexandre Oliva, Jan Hubicka


[-- Attachment #1.1: Type: text/plain, Size: 812 bytes --]

Processor: SH4
Mode: Little Endian
 
On compiling the attached program with -O2, I observed that reload pass
splits up the DF mode address arithmetic insn into SF mode insns using FPSCR
register.
 
I believe SH4 has the restriction that in little endian, we can't use
double-precision floating-point moves between memory and registers.
 
But generating two insns, where one would suffice, introduces redundancy.
Also to adjust the pointers, we are generating additional insns as defined
in define_split in sh.md.
 
I propose developing a function that scans the basic block
and combines address arithmetic insns. But I'm not clear about where to put
it so that it does not affect the present compilation process.
 
Can anybody suggest an alternative?
 
Thanks in advance
Rakesh Kumar


[-- Attachment #2: address.c --]
[-- Type: application/octet-stream, Size: 154 bytes --]

#include <stdarg.h>

void func(char *, ...);

void func2(int *a, double *b)
{
   func(".", a[0], b[0], a[1], b[1], a[2], b[2], a[3], b[3], a[4], b[4]);
}

