public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/37194]  New: Autovectorization of constant iteration loop degrades performance
@ 2008-08-21 19:23 pthaugen at gcc dot gnu dot org
  2008-08-21 19:33 ` [Bug tree-optimization/37194] Autovectorization of small " pinskia at gcc dot gnu dot org
                   ` (13 more replies)
  0 siblings, 14 replies; 15+ messages in thread
From: pthaugen at gcc dot gnu dot org @ 2008-08-21 19:23 UTC (permalink / raw)
  To: gcc-bugs

I am seeing a performance degradation in the SPEC CPU2000 benchmark 252.eon
that is caused by autovectorization of a simple loop in the function
ggSpectrum::Set(float).

Here's a simple C version.

void ggSpectrum_Set(float * data, float d) {
   int i;
   for (i = 0; i < 8; i++)
      data[i] = d;
}


When compiled with -O3 -mcpu=970 the following code is generated:

ggSpectrum_Set:
        mfvrsave 0
        stwu 1,-48(1)
        stw 0,44(1)
        oris 0,0,0x8000
        mtvrsave 0
        li 10,0
        rlwinm 0,3,30,30,31
        subfic 0,0,4
        andi. 9,0,3
        beq- 0,.L16
        mtctr 9
        .p2align 4,,15
.L10:
        slwi 0,10,2
        addi 10,10,1
        stfsx 1,3,0
        subfic 8,10,8
        bdnz .L10
.L3:
        subfic 6,9,8
        srwi 0,6,2
        slwi. 7,0,2
        beq- 0,.L5
        mtctr 0
        stfs 1,16(1)
        cmpwi 7,0,0
        li 0,16
        slwi 9,9,2
        li 11,0
        add 9,3,9
        lvewx 0,1,0
        vspltw 0,0,0
        beq- 7,.L17
        .p2align 4,,15
.L6:
        slwi 0,11,4
        addi 11,11,1
        stvx 0,9,0
        bdnz .L6
        cmpw 7,6,7
        subf 8,7,8
        add 10,10,7
        beq- 7,.L9
.L5:
        mtctr 8
        slwi 0,10,2
        add 3,3,0
        .p2align 4,,15
.L8:
        stfs 1,0(3)
        addi 3,3,4
        bdnz .L8
.L9:
        lwz 12,44(1)
        mtvrsave 12
        addi 1,1,48
        blr
.L16:
        mr 10,9
        li 8,8
        b .L3
.L17:
        li 0,1
        mtctr 0
        b .L6


Adding -mno-altivec results in this simpler sequence, and a significant boost
in performance (~40% speedup for the benchmark):

ggSpectrum_Set:
        stfs 1,28(3)
        stfs 1,0(3)
        stfs 1,4(3)
        stfs 1,8(3)
        stfs 1,12(3)
        stfs 1,16(3)
        stfs 1,20(3)
        stfs 1,24(3)
        blr


Another thing that stood out from the benchmark run was that the code was
taking a pretty big hit on a couple of the statically predicted branches (the
beq- instructions in the vectorized listing, which are hinted as not taken).
Apparently the address was already 16-byte aligned a lot of the time, so those
branches were taken against the hint. It seems like it would be best to remove
the static prediction and let the hardware prediction take over.
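
To make the alignment point concrete, here is a small C model of the four
beq- branches in the vectorized listing. It is only a sketch based on my
reading of the assembly, and it assumes the "-" suffix means "predict not
taken". The function names peel_count and hinted_branches_taken are
hypothetical. For a given store address it counts how many of those branches
would be taken, i.e. go against the static hint.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar iterations peeled before the vector loop
   (rlwinm 0,3,30,30,31 / subfic 0,0,4 / andi. 9,0,3). */
static int peel_count(uintptr_t addr)
{
    return (4 - (int)((addr >> 2) & 3)) & 3;
}

/* Number of `beq-` branches that are taken for the 8-element loop. */
static int hinted_branches_taken(uintptr_t addr)
{
    const int n = 8;
    int peel = peel_count(addr);
    assert(peel >= 0 && peel <= 3);

    int remaining = n - peel;         /* subfic 6,9,8 */
    int vec_iters = remaining / 4;    /* srwi 0,6,2   */
    int vec_elems = vec_iters * 4;    /* slwi. 7,0,2  */
    int taken = 0;

    if (peel == 0)                    /* beq- 0,.L16: skip the prologue */
        taken++;
    if (vec_elems == 0)               /* beq- 0,.L5: skip the vector loop */
        return taken + 1;
    if (vec_iters == 0)               /* beq- 7,.L17: cannot be true here */
        taken++;
    if (remaining == vec_elems)       /* beq- 7,.L9: skip the epilogue */
        taken++;
    return taken;
}
```

Under these assumptions a 16-byte aligned address takes two of the hinted
branches (.L16 and .L9), while an address that is 4, 8 or 12 bytes past
alignment takes none, which would be consistent with the "couple of"
mispredicted branches seen in the benchmark.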


-- 
           Summary: Autovectorization of constant iteration loop degrades
                    performance
           Product: gcc
           Version: 4.4.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: pthaugen at gcc dot gnu dot org
 GCC build triplet: powerpc64-linux
  GCC host triplet: powerpc64-linux
GCC target triplet: powerpc64-linux


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37194



Thread overview: 15+ messages
2008-08-21 19:23 [Bug tree-optimization/37194] New: Autovectorization of constant iteration loop degrades performance pthaugen at gcc dot gnu dot org
2008-08-21 19:33 ` [Bug tree-optimization/37194] Autovectorization of small " pinskia at gcc dot gnu dot org
2008-08-22  9:54 ` rguenth at gcc dot gnu dot org
2008-08-22 13:33 ` dorit at gcc dot gnu dot org
2008-12-27  5:55 ` pinskia at gcc dot gnu dot org
2008-12-27  5:58 ` [Bug tree-optimization/37194] [4.3/4.4 Regression] " pinskia at gcc dot gnu dot org
2008-12-29 21:58 ` rguenth at gcc dot gnu dot org
2008-12-30 14:58 ` irar at il dot ibm dot com
2009-01-05 13:58 ` irar at il dot ibm dot com
2009-01-08  8:00 ` irar at gcc dot gnu dot org
2009-01-08  9:23 ` jakub at gcc dot gnu dot org
2009-01-08  9:24 ` cnstar9988 at gmail dot com
2009-01-08  9:26 ` irar at il dot ibm dot com
2009-01-11  7:55 ` [Bug tree-optimization/37194] [4.3 " irar at gcc dot gnu dot org
2009-01-11  7:57 ` irar at il dot ibm dot com
