* RE: Address Arithmetic Improvement
@ 2003-04-10 10:13 Rakesh Kumar - Software, Noida
0 siblings, 0 replies; 3+ messages in thread
From: Rakesh Kumar - Software, Noida @ 2003-04-10 10:13 UTC (permalink / raw)
To: tm_gccmail; +Cc: gcc, Alexandre Oliva
Hi,
> I was thinking about this problem a few weeks ago, and it may be possible
> to use the peep2 pass to opportunistically combine multiple adds into a
> single add instruction, such as
peep2 only matches adjacent instructions, so defining a single peep2
pattern will not cover all cases. For instance, in your example:
fmov.s @r0+,fr0
fmov.s @r0,fr1
add #-4,r0
... <-- Here
add #64,r0
fmov.s @r0+,fr2
fmov.s @r0,fr3
At the location marked Here, there can be multiple instructions. Now, if we
are considering a peep window of, say, 4 instructions, we eventually need to
define a peep2 for 3 cases, namely:
1. add #-4, r0
   add #64, r0
2. add #-4, r0
   one insn not ending the basic block and not using r0 (use match_insn)
   add #64, r0
3. add #-4, r0
   two insns with the same constraints
   add #64, r0
As we increase the peephole window, we need to increase the number of
peepholes as well.
This does not seem to solve the problem, since there might be instructions
that fall outside that window but still sit in the same basic block.
As the insns are split during the post-reload pass, such cases are possible.
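
To make the window limitation concrete, here is a minimal sketch in C. It uses a toy instruction model, not GCC RTL; `struct insn` and `window_sees_add_pair` are hypothetical names invented for this illustration. It only answers whether a peephole restricted to `window` consecutive insns could see both adds to r0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy instruction model (an assumption for illustration, not GCC RTL). */
struct insn {
    bool is_add_r0;   /* true for "add #imm,r0" */
    int  imm;         /* the constant, when is_add_r0 is true */
    bool touches_r0;  /* any other insn that reads or writes r0 */
};

/* Return true if a peephole covering `window` consecutive insns,
   beginning at insns[start], contains a second add to r0 with only
   r0-independent insns in between. */
bool window_sees_add_pair(const struct insn *insns, size_t n,
                          size_t start, size_t window)
{
    assert(insns != NULL && start < n);
    if (!insns[start].is_add_r0)
        return false;
    for (size_t i = start + 1; i < n && i < start + window; i++) {
        if (insns[i].is_add_r0)
            return true;          /* second add is inside the window */
        if (insns[i].touches_r0)
            return false;         /* r0 is used in between: not combinable */
    }
    return false;                 /* second add lies outside the window */
}
```

With three unrelated insns between the two adds, a window of 4 misses the pair while a window of 5 finds it, which is the growth in patterns described above.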
I suppose there are two solutions to this problem:
1. We could develop a new pass that improves the address arithmetic.
The disadvantage of this is the increased compile time.
2. A cleaner solution is to improve the implementation of -mfmovd on SH4.
This flag enables double-sized floating-point moves, but at present it is
not supported properly. If it is set, the DF splits should not happen, and
no extra address arithmetic instructions would be generated.
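
For solution 1, a rough sketch of what such a scan could look like, written in C over a toy instruction array rather than real RTL. All names here (`struct insn`, `fits_imm8`, `combine_r0_adds`) are hypothetical and only for illustration; a real pass would work on insn chains and handle every register, not just r0. The range check reflects that SH `add #imm,Rn` takes an 8-bit signed immediate.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy instruction model (an assumption for illustration, not GCC RTL). */
struct insn {
    bool is_add_r0;   /* true for "add #imm,r0" */
    int  imm;         /* the constant, when is_add_r0 is true */
    bool touches_r0;  /* any other insn that reads or writes r0 */
    bool deleted;     /* set when the insn has been merged away */
};

/* SH "add #imm,Rn" encodes an 8-bit signed immediate. */
static bool fits_imm8(int v) { return v >= -128 && v <= 127; }

/* Scan one basic block. Fold each add to r0 into the next add to r0,
   provided nothing in between touches r0 and the sum still fits the
   immediate field. Returns the number of adds removed. */
size_t combine_r0_adds(struct insn *bb, size_t n)
{
    assert(bb != NULL || n == 0);
    size_t removed = 0;
    size_t pending = n;   /* index of the last unmerged add; n means none */

    for (size_t i = 0; i < n; i++) {
        if (bb[i].is_add_r0) {
            if (pending != n && fits_imm8(bb[pending].imm + bb[i].imm)) {
                bb[i].imm += bb[pending].imm;   /* e.g. -4 + 64 = 60 */
                bb[pending].deleted = true;
                removed++;
            }
            pending = i;
        } else if (bb[i].touches_r0) {
            pending = n;  /* r0 is live here: earlier add must stay */
        }
    }
    return removed;
}
```

Unlike a peephole, this does not care how many insns separate the two adds, only whether r0 is touched between them. The cost is one linear walk per basic block, which is the compile-time concern mentioned in point 1.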
What do you think?
Thanks and Best Regards,
Rakesh Kumar
* Re: Address Arithmetic Improvement
2003-04-09 14:53 Rakesh Kumar - Software, Noida
@ 2003-04-09 20:24 ` tm_gccmail
0 siblings, 0 replies; 3+ messages in thread
From: tm_gccmail @ 2003-04-09 20:24 UTC (permalink / raw)
To: Rakesh Kumar - Software, Noida; +Cc: gcc, Alexandre Oliva, Jan Hubicka
On Wed, 9 Apr 2003, Rakesh Kumar - Software, Noida wrote:
> Processor: SH4
> Mode: Little Endian
>
> On compiling the attached program with -O2, I observed that reload pass
> splits up the DF mode address arithmetic insn into SF mode insns using FPSCR
> register.
>
> I believe SH4 has the restriction that in little endian, we can't use
> double-precision floating-point moves between memory and registers.
>
> But generating two insns where one would suffice introduces redundancy.
> Also to adjust the pointers, we are generating additional insns as defined
> in define_split in sh.md.
>
> I propose developing a function that scans the basic block and combines
> address arithmetic insns. But I'm not clear about where to put it so that
> it does not affect the present compilation process.
>
> Can anybody suggest an alternative?
>
> Thanks in advance
> Rakesh Kumar
>
I think regmove is supposed to do this, but it runs too early to catch the
post-reload DFmode splits.
I was thinking about this problem a few weeks ago, and it may be possible
to use the peep2 pass to opportunistically combine multiple adds into a
single add instruction, such as:
fmov.s @r0+,fr0
fmov.s @r0,fr1
add #-4,r0
...
add #64,r0
fmov.s @r0+,fr2
fmov.s @r0,fr3
into
fmov.s @r0+,fr0
fmov.s @r0,fr1
...
add #60,r0
fmov.s @r0+,fr2
fmov.s @r0,fr3
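
As a sanity check on the transformation above, here is a small C sketch that traces the load addresses of both sequences. It is a toy model with hypothetical names (`enum op`, `struct insn`, `trace_loads`), and it assumes the elided "..." insns do not touch r0 and that `fmov.s @r0+` advances r0 by 4 bytes (single-precision transfer).

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of the three insn forms in the example (illustration only). */
enum op {
    LOAD_POSTINC,  /* fmov.s @r0+,frN */
    LOAD,          /* fmov.s @r0,frN  */
    ADD            /* add #imm,r0     */
};

struct insn { enum op op; int imm; };

/* Execute the insns starting from the given r0 and record the address
   of every load in out[]. Returns the number of loads recorded. */
size_t trace_loads(const struct insn *code, size_t n, long r0,
                   long *out, size_t cap)
{
    assert(code != NULL && out != NULL);
    size_t k = 0;
    for (size_t i = 0; i < n; i++) {
        switch (code[i].op) {
        case LOAD_POSTINC:
            assert(k < cap);
            out[k++] = r0;
            r0 += 4;            /* post-increment by one SFmode word */
            break;
        case LOAD:
            assert(k < cap);
            out[k++] = r0;
            break;
        case ADD:
            r0 += code[i].imm;
            break;
        }
    }
    return k;
}
```

Feeding it the original sequence (with add #-4 and add #64) and the combined one (with add #60) from the same starting r0 gives the same four load addresses: base, base+4, base+64, base+68.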
Toshi
* Address Arithmetic Improvement
@ 2003-04-09 14:53 Rakesh Kumar - Software, Noida
2003-04-09 20:24 ` tm_gccmail
0 siblings, 1 reply; 3+ messages in thread
From: Rakesh Kumar - Software, Noida @ 2003-04-09 14:53 UTC (permalink / raw)
To: gcc; +Cc: Alexandre Oliva, Jan Hubicka
[-- Attachment #1.1: Type: text/plain, Size: 812 bytes --]
Processor: SH4
Mode: Little Endian
On compiling the attached program with -O2, I observed that reload pass
splits up the DF mode address arithmetic insn into SF mode insns using FPSCR
register.
I believe SH4 has the restriction that in little endian, we can't use
double-precision floating-point moves between memory and registers.
But generating two insns where one would suffice introduces redundancy.
Also to adjust the pointers, we are generating additional insns as defined
in define_split in sh.md.
I propose developing a function that scans the basic block and combines
address arithmetic insns. But I'm not clear about where to put it so that
it does not affect the present compilation process.
Can anybody suggest an alternative?
Thanks in advance
Rakesh Kumar
[-- Attachment #2: address.c --]
[-- Type: application/octet-stream, Size: 154 bytes --]
#include <stdarg.h>
void func(char *, ...);
void func2(int *a, double *b)
{
    func(".", a[0], b[0], a[1], b[1], a[2], b[2], a[3], b[3], a[4], b[4]);
}