public inbox for gcc-bugs@sourceware.org
* [Bug rtl-optimization/17264] [hppa] Missing address increment optimization for fp load/stores
From: falk at debian dot org @ 2006-09-24 19:52 UTC
  To: gcc-bugs



------- Comment #1 from falk at debian dot org  2006-09-24 19:52 -------
For this test case:

void f(double *pds, double *pdd, unsigned long len) {
  while (len >= 8*sizeof(double)) {
    register double r1,r2,r3,r4;
    r1 = *pds++;
    r2 = *pds++;
    r3 = *pds++;
    r4 = *pds++;
    *pdd++ = r1;
    *pdd++ = r2;
    *pdd++ = r3;
    *pdd++ = r4;
  }
}

gcc starting from 4.0 produces this:

.L3:
        fldds -16(%r26),%fr22
        fldds -8(%r26),%fr23
        fldds 0(%r26),%fr24
        fldds 8(%r26),%fr25
        ldo 32(%r26),%r26
        fstds %fr22,-16(%r25)
        fstds %fr23,-8(%r25)
        fstds %fr24,0(%r25)
        fstds %fr25,8(%r25)
        b .L3

which I suspect is actually better, since it avoids dependencies between the
loads. I'm not familiar with hppa, though; can anybody comment?
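
Note that the test case as posted never modifies len, so once the loop is
entered it cannot exit; this appears consistent with the unconditional
"b .L3" in the listing. For experimenting outside the bug report, a
terminating variant might look like the sketch below. The name copy_blocks,
the returned remainder, and the choice of 4*sizeof(double) as the byte count
consumed per iteration are assumptions made for illustration; none of them
come from the original report.

```c
#include <stddef.h>

/*
 * Hypothetical terminating variant of the test case.
 * Assumption: len is a byte count, and each iteration copies
 * four doubles, so len shrinks by 4 * sizeof(double) per pass.
 * Returns the number of bytes left uncopied (less than one block).
 */
size_t copy_blocks(const double *pds, double *pdd, size_t len)
{
    const size_t block = 4 * sizeof(double);

    while (len >= block) {
        /* Load a full block first, then store it, as in the report. */
        double r1 = *pds++;
        double r2 = *pds++;
        double r3 = *pds++;
        double r4 = *pds++;
        *pdd++ = r1;
        *pdd++ = r2;
        *pdd++ = r3;
        *pdd++ = r4;
        len -= block;   /* the step missing from the posted loop */
    }
    return len;
}
```

With a conditional exit the compiler has to emit a compare-and-branch at the
bottom of the loop, so the generated code will not match the listing above
line for line.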


-- 

falk at debian dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
      Known to fail|                            |3.4.2 4.1.2


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17264



* [Bug rtl-optimization/17264] [hppa] Missing address increment optimization for fp load/stores
From: dave at hiauly1 dot hia dot nrc dot ca @ 2006-09-24 22:15 UTC
  To: gcc-bugs



------- Comment #2 from dave at hiauly1 dot hia dot nrc dot ca  2006-09-24 22:15 -------
Subject: Re:  [hppa] Missing address increment optimization for fp load/stores

> For this test case:
> 
> void f(double *pds, double *pdd, unsigned long len) {
>   while (len >= 8*sizeof(double)) {
>     register double r1,r2,r3,r4;
>     r1 = *pds++;
>     r2 = *pds++;
>     r3 = *pds++;
>     r4 = *pds++;
>     *pdd++ = r1;
>     *pdd++ = r2;
>     *pdd++ = r3;
>     *pdd++ = r4;
>   }
> }
> 
> gcc starting from 4.0 produces this:
> 
> .L3:
>         fldds -16(%r26),%fr22
>         fldds -8(%r26),%fr23
>         fldds 0(%r26),%fr24
>         fldds 8(%r26),%fr25
>         ldo 32(%r26),%r26
>         fstds %fr22,-16(%r25)
>         fstds %fr23,-8(%r25)
>         fstds %fr24,0(%r25)
>         fstds %fr25,8(%r25)
>         b .L3
> 
> which I suspect is actually better, since it avoids dependencies between the
> loads. But I'm not familiar with hppa, can anybody comment?

It looks close to optimal to me.  The code is better than that generated
by 3.4.x or HP cc.  Using the auto-increment forms would eliminate the
two ldo instructions that increment r25 and r26.

Dave



* [Bug rtl-optimization/17264] [hppa] Missing address increment optimization for fp load/stores
From: randolph at tausq dot org @ 2006-09-24 23:48 UTC
  To: gcc-bugs



------- Comment #3 from randolph at tausq dot org  2006-09-24 23:48 -------
Subject: Re: [hppa] Missing address increment optimization for fp load/stores

>> gcc starting from 4.0 produces this:
>>
>> .L3:
>>         fldds -16(%r26),%fr22
>>         fldds -8(%r26),%fr23
>>         fldds 0(%r26),%fr24
>>         fldds 8(%r26),%fr25
>>         ldo 32(%r26),%r26
>>         fstds %fr22,-16(%r25)
>>         fstds %fr23,-8(%r25)
>>         fstds %fr24,0(%r25)
>>         fstds %fr25,8(%r25)
>>         b .L3
>>
>> which I suspect is actually better, since it avoids dependencies between the
>> loads. But I'm not familiar with hppa, can anybody comment?
> 
> It looks close to optimal to me.  The code is better than that generated
> by 3.4.x or HP cc.  Using the auto-increment forms would allow elimination
> of the two ldo instructions to increment r25 and r26.

Yeah, this looks pretty good. I've been told that not using the 
autoincrement forms might be even better as it avoids interlocks between 
successive instructions. The ldo insn just gets pipelined so it doesn't 
necessarily slow things down.
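
To make the two addressing styles in this discussion concrete at the source
level, here is a minimal sketch. copy4_postinc bumps the pointer after every
access, which is the shape that maps most directly onto auto-increment
loads and stores; copy4_offset reads and writes at fixed offsets from a base
and advances each base once per block, which resembles the displacement
loads plus a single ldo in the listing above. Both function names and the
n_blocks parameter are hypothetical. Which instructions the compiler
actually emits depends on its version and options, not on which of these
source forms is used.

```c
#include <stddef.h>

/* Hypothetical helper: post-increment style, one pointer bump per access. */
void copy4_postinc(const double *src, double *dst, size_t n_blocks)
{
    while (n_blocks-- > 0) {
        double r1 = *src++;
        double r2 = *src++;
        double r3 = *src++;
        double r4 = *src++;
        *dst++ = r1;
        *dst++ = r2;
        *dst++ = r3;
        *dst++ = r4;
    }
}

/* Hypothetical helper: base-plus-offset style, one pointer bump per block. */
void copy4_offset(const double *src, double *dst, size_t n_blocks)
{
    while (n_blocks-- > 0) {
        /* The four loads share one base, so none waits on an address update. */
        double r1 = src[0];
        double r2 = src[1];
        double r3 = src[2];
        double r4 = src[3];
        dst[0] = r1;
        dst[1] = r2;
        dst[2] = r3;
        dst[3] = r4;
        src += 4;   /* single address update per block */
        dst += 4;
    }
}
```

The two functions copy the same data; they differ only in how the addresses
are expressed, which is the point under discussion here.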

I'll mark this bug as resolved.

thanks
randolph



* [Bug rtl-optimization/17264] [hppa] Missing address increment optimization for fp load/stores
From: tausq at debian dot org @ 2006-09-24 23:49 UTC
  To: gcc-bugs



------- Comment #4 from tausq at debian dot org  2006-09-24 23:49 -------
Fixed in gcc-4.x


-- 

tausq at debian dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|                            |FIXED



* [Bug rtl-optimization/17264] [hppa] Missing address increment optimization for fp load/stores
  2004-09-01 18:22 [Bug rtl-optimization/17264] New: " tausq at debian dot org
@ 2004-09-01 18:53 ` danglin at gcc dot gnu dot org
From: danglin at gcc dot gnu dot org @ 2004-09-01 18:53 UTC
  To: gcc-bugs



-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |danglin at gcc dot gnu dot org

