public inbox for gcc-bugs@sourceware.org
* [Bug rtl-optimization/17264] [hppa] Missing address increment optimization for fp load/stores
From: falk at debian dot org @ 2006-09-24 19:52 UTC
To: gcc-bugs
------- Comment #1 from falk at debian dot org 2006-09-24 19:52 -------
For this test case:
void f(double *pds, double *pdd, unsigned long len) {
    while (len >= 8*sizeof(double)) {
        register double r1,r2,r3,r4;
        r1 = *pds++;
        r2 = *pds++;
        r3 = *pds++;
        r4 = *pds++;
        *pdd++ = r1;
        *pdd++ = r2;
        *pdd++ = r3;
        *pdd++ = r4;
    }
}
gcc starting from 4.0 produces this:
.L3:
        fldds -16(%r26),%fr22
        fldds -8(%r26),%fr23
        fldds 0(%r26),%fr24
        fldds 8(%r26),%fr25
        ldo 32(%r26),%r26
        fstds %fr22,-16(%r25)
        fstds %fr23,-8(%r25)
        fstds %fr24,0(%r25)
        fstds %fr25,8(%r25)
        b .L3
which I suspect is actually better, since it avoids dependencies between the
loads. But I'm not familiar with hppa, can anybody comment?
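
As a source-level illustration of the two addressing styles under discussion, here is a minimal sketch that is not part of the original report. It contrasts the post-increment form of the test case with a form that uses fixed offsets from a base pointer and bumps each pointer once per iteration, which is roughly the shape of the 4.x output above. The names copy_postinc, copy_disp and the element count n are hypothetical, and unlike the test case these loops decrement their counter so that they terminate. Which addressing mode a compiler actually picks for either form depends on the target and its optimization passes; the sketch only shows that both forms compute the same result.

```c
#include <assert.h>
#include <stddef.h>

/* Post-increment form, as in the test case: every access also advances
 * the pointer, so each address depends on the previous access. */
void copy_postinc(const double *src, double *dst, size_t n)
{
    assert(n % 4 == 0);            /* assumption: n is a multiple of 4 */
    while (n >= 4) {
        double r1 = *src++;
        double r2 = *src++;
        double r3 = *src++;
        double r4 = *src++;
        *dst++ = r1;
        *dst++ = r2;
        *dst++ = r3;
        *dst++ = r4;
        n -= 4;
    }
}

/* Displacement form: four accesses at fixed offsets from one base,
 * then a single pointer update per iteration. */
void copy_disp(const double *src, double *dst, size_t n)
{
    assert(n % 4 == 0);            /* assumption: n is a multiple of 4 */
    while (n >= 4) {
        double r1 = src[0];
        double r2 = src[1];
        double r3 = src[2];
        double r4 = src[3];
        dst[0] = r1;
        dst[1] = r2;
        dst[2] = r3;
        dst[3] = r4;
        src += 4;
        dst += 4;
        n -= 4;
    }
}
```

In the displacement form the four loads share one base register and do not depend on each other, which is the property the question above is about.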
--
falk at debian dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
Known to fail| |3.4.2 4.1.2
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17264
* [Bug rtl-optimization/17264] [hppa] Missing address increment optimization for fp load/stores
From: dave at hiauly1 dot hia dot nrc dot ca @ 2006-09-24 22:15 UTC
To: gcc-bugs
------- Comment #2 from dave at hiauly1 dot hia dot nrc dot ca 2006-09-24 22:15 -------
Subject: Re: [hppa] Missing address increment optimization for fp load/stores
> For this test case:
>
> void f(double *pds, double *pdd, unsigned long len) {
>     while (len >= 8*sizeof(double)) {
>         register double r1,r2,r3,r4;
>         r1 = *pds++;
>         r2 = *pds++;
>         r3 = *pds++;
>         r4 = *pds++;
>         *pdd++ = r1;
>         *pdd++ = r2;
>         *pdd++ = r3;
>         *pdd++ = r4;
>     }
> }
>
> gcc starting from 4.0 produces this:
>
> .L3:
>         fldds -16(%r26),%fr22
>         fldds -8(%r26),%fr23
>         fldds 0(%r26),%fr24
>         fldds 8(%r26),%fr25
>         ldo 32(%r26),%r26
>         fstds %fr22,-16(%r25)
>         fstds %fr23,-8(%r25)
>         fstds %fr24,0(%r25)
>         fstds %fr25,8(%r25)
>         b .L3
>
> which I suspect is actually better, since it avoids dependencies between the
> loads. But I'm not familiar with hppa, can anybody comment?
It looks close to optimal to me. The code is better than that generated
by 3.4.x or HP cc. Using the auto-increment forms would allow elimination
of the two ldo instructions to increment r25 and r26.
Dave
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17264
* [Bug rtl-optimization/17264] [hppa] Missing address increment optimization for fp load/stores
From: randolph at tausq dot org @ 2006-09-24 23:48 UTC
To: gcc-bugs
------- Comment #3 from randolph at tausq dot org 2006-09-24 23:48 -------
Subject: Re: [hppa] Missing address increment optimization for fp load/stores
>> gcc starting from 4.0 produces this:
>>
>> .L3:
>>         fldds -16(%r26),%fr22
>>         fldds -8(%r26),%fr23
>>         fldds 0(%r26),%fr24
>>         fldds 8(%r26),%fr25
>>         ldo 32(%r26),%r26
>>         fstds %fr22,-16(%r25)
>>         fstds %fr23,-8(%r25)
>>         fstds %fr24,0(%r25)
>>         fstds %fr25,8(%r25)
>>         b .L3
>>
>> which I suspect is actually better, since it avoids dependencies between the
>> loads. But I'm not familiar with hppa, can anybody comment?
>
> It looks close to optimal to me. The code is better than that generated
> by 3.4.x or HP cc. Using the auto-increment forms would allow elimination
> of the two ldo instructions to increment r25 and r26.
Yeah, this looks pretty good. I've been told that not using the
autoincrement forms might be even better, as it avoids interlocks between
successive instructions. The ldo insn just gets pipelined, so it doesn't
necessarily slow things down.
I'll mark this bug as resolved.
thanks
randolph
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17264
* [Bug rtl-optimization/17264] [hppa] Missing address increment optimization for fp load/stores
From: tausq at debian dot org @ 2006-09-24 23:49 UTC
To: gcc-bugs
------- Comment #4 from tausq at debian dot org 2006-09-24 23:49 -------
Fixed in gcc-4.x
--
tausq at debian dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|UNCONFIRMED |RESOLVED
Resolution| |FIXED
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17264
* [Bug rtl-optimization/17264] [hppa] Missing address increment optimization for fp load/stores
2004-09-01 18:22 [Bug rtl-optimization/17264] New: " tausq at debian dot org
@ 2004-09-01 18:53 ` danglin at gcc dot gnu dot org
0 siblings, 0 replies; 5+ messages in thread
From: danglin at gcc dot gnu dot org @ 2004-09-01 18:53 UTC
To: gcc-bugs
--
What |Removed |Added
----------------------------------------------------------------------------
CC| |danglin at gcc dot gnu dot org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17264